Dual-Thresholded Heatmap-Guided Proposal Clustering and Negative Certainty Supervision with Enhanced Base Network for Weakly Supervised Object Detection

Haoyu He; Lu Shi; Renhao Lu; Weizhe Zhang; Yuelin Guo; Zejun Wang; Zhiyuan Chen; Zitong Huang

arxiv: 2509.08289 · v2 · submitted 2025-09-10 · 💻 cs.CV


Yuelin Guo, Haoyu He, Zhiyuan Chen, Zitong Huang, Renhao Lu, Lu Shi, Zejun Wang, Weizhe Zhang

Pith reviewed 2026-05-18 17:28 UTC · model grok-4.3

classification 💻 cs.CV

keywords: weakly supervised object detection · heatmap-guided proposal selector · pseudo ground truth boxes · negative certainty supervision · enhanced base network · proposal clustering · PASCAL VOC · MS COCO


The pith

The DANCE method for weakly supervised object detection generates pseudo ground truth boxes via dual thresholds on heatmaps, augments the base network with background class and heatmap pre-supervision, and applies negative certainty loss to

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper proposes the DANCE framework to fix three persistent problems in weakly supervised object detection. Prior methods often produce pseudo ground truth boxes that either highlight only object parts or merge adjacent instances of the same class. The new approach first uses a heatmap-guided proposal selector with dual thresholds to pre-select proposals that cover full objects while keeping instances apart. It then builds an enhanced base network that adds an explicit background class representation to each proposal and uses heatmaps to reduce the semantic gap between network branches. Finally, it adds a negative certainty supervision loss on previously ignored proposals, keeping them in the optimization to speed up convergence rather than discarding them.

Core claim

The authors introduce DANCE, which has three components. A heatmap-guided proposal selector (HGPS) applies dual thresholds to heatmaps to pre-select proposals, so that pseudo GT boxes capture the full object extent and distinguish adjacent intra-class instances. A weakly supervised basic detection network (WSBDN) augments each proposal with a background class and uses heatmaps for pre-supervision to bridge the semantic gap between matrices. A negative certainty supervision (NCS) loss is applied to ignored proposals to accelerate convergence. The authors report that extensive experiments on PASCAL VOC and MS COCO demonstrate effectiveness and superiority over prior WSOD methods.
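For orientation, a WSDDN-style base network scores proposals with two branches: one applies a softmax over classes for each proposal, the other a softmax over proposals for each class, and the two are multiplied elementwise and summed over proposals to give image-level scores. The sketch below adds a background column to the classification branch as one possible reading of "augments each proposal with a background class". The paper's actual WSBDN formulation is not given in this summary, so the background layout and the names `softmax` and `two_branch_scores` are assumptions made for illustration.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def two_branch_scores(cls_logits, det_logits):
    """WSDDN-style two-branch scoring with an assumed background column.

    cls_logits: one row per proposal with C + 1 entries (last = background, assumed).
    det_logits: one row per proposal with C entries (foreground classes only).
    Returns (proposal_scores, image_scores) over the C foreground classes.
    """
    num_props = len(cls_logits)
    num_fg = len(det_logits[0])
    # Classification branch: softmax across classes (including background) per proposal.
    cls_prob = [softmax(row) for row in cls_logits]
    # Detection branch: softmax across proposals, computed separately for each class.
    det_prob = [softmax([det_logits[r][c] for r in range(num_props)])
                for c in range(num_fg)]
    proposal_scores = [[cls_prob[r][c] * det_prob[c][r] for c in range(num_fg)]
                       for r in range(num_props)]
    # Image-level score per class: sum of proposal scores, always within [0, 1].
    image_scores = [sum(proposal_scores[r][c] for r in range(num_props))
                    for c in range(num_fg)]
    return proposal_scores, image_scores
```

With the background column present, a proposal can assign most of its classification mass to background, which lowers its contribution to every foreground class at once.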

What carries the argument

The heatmap-guided proposal selector (HGPS) algorithm that applies two distinct thresholds to heatmaps to pre-select proposals for generating pseudo ground truth boxes.
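The selection rule can be sketched roughly as follows, based on the Figure 2 caption: threshold the heatmap twice, take tight boxes around the resulting regions, and keep proposals that lie between a high-threshold box and the scaled low-threshold box around it. This is a minimal sketch, not the authors' implementation. The function names, the 4-connected region definition, the default scale factor of 1.2, and the use of the first enclosing low box are all assumptions.

```python
from collections import deque

def tight_boxes(heatmap, threshold):
    """Tight boxes (x0, y0, x1, y1) around 4-connected regions with value >= threshold."""
    height, width = len(heatmap), len(heatmap[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if seen[y][x] or heatmap[y][x] < threshold:
                continue
            seen[y][x] = True
            queue = deque([(x, y)])
            x0, y0, x1, y1 = x, y, x, y
            while queue:  # breadth-first flood fill of one region
                cx, cy = queue.popleft()
                x0, y0 = min(x0, cx), min(y0, cy)
                x1, y1 = max(x1, cx), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and not seen[ny][nx] and heatmap[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes

def contains(outer, inner):
    """True if box `inner` lies entirely inside box `outer`."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])

def scale_box(box, factor):
    """Scale a box about its centre by `factor`."""
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    half_w = (box[2] - box[0]) * factor / 2
    half_h = (box[3] - box[1]) * factor / 2
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)

def cluster_proposals(heatmap, proposals, t_high, t_low, scale=1.2):
    """Return one list of proposal indices per high-threshold box (requires t_high >= t_low)."""
    low_boxes = tight_boxes(heatmap, t_low)
    clusters = []
    for high in tight_boxes(heatmap, t_high):
        # A high region is a subset of some low region, so an enclosing low box exists.
        low = next(b for b in low_boxes if contains(b, high))
        outer = scale_box(low, scale)
        clusters.append([i for i, p in enumerate(proposals)
                         if contains(p, high) and contains(outer, p)])
    return clusters
```

Each returned cluster is a candidate set for one pseudo GT box; per the caption, the top-scoring proposal in each cluster would then be selected during training. Note that this sketch does not exclude a proposal that spans several high-threshold boxes, and how the paper handles that case is not stated in this summary.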

Load-bearing premise

Dual thresholds applied to heatmaps will reliably produce pseudo GT boxes that capture full object extent and separate adjacent intra-class instances without introducing new failure modes on the target datasets.

What would settle it

On a test set containing many overlapping objects of the same class, measure whether dual-threshold pseudo GT boxes separate instances correctly and whether mean average precision improves over single-threshold baselines.
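One way to quantify "separates instances correctly" is to count how many ground-truth instances are matched one-to-one by a pseudo GT box at a fixed IoU threshold. The sketch below assumes boxes given as `(x0, y0, x1, y1)` with continuous coordinates, a threshold of 0.5, and a simple greedy matching; the function names and the metric itself are illustrative choices, not a protocol taken from the paper.

```python
def iou(a, b):
    """Intersection over union of two (x0, y0, x1, y1) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def separated_fraction(pseudo_boxes, gt_boxes, min_iou=0.5):
    """Fraction of ground-truth instances matched one-to-one by a pseudo box."""
    unused = list(pseudo_boxes)
    matched = 0
    for gt in gt_boxes:
        # Greedy choice: the best remaining pseudo box for this instance.
        best = max(unused, key=lambda p: iou(p, gt), default=None)
        if best is not None and iou(best, gt) >= min_iou:
            unused.remove(best)  # each pseudo box may explain only one instance
            matched += 1
    return matched / len(gt_boxes) if gt_boxes else 0.0
```

A single pseudo box that merges two adjacent instances can match at most one of them, so the fraction drops when instances are merged, which is the failure mode the dual thresholds are meant to avoid.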

Figures

Figures reproduced from arXiv: 2509.08289 by Haoyu He, Lu Shi, Renhao Lu, Weizhe Zhang, Yuelin Guo, Zejun Wang, Zhiyuan Chen, Zitong Huang.

**Figure 2.** The overview architecture of DTH-CP. HGPS: Given an image, category-specific heatmaps are first obtained through the heatmap extractor. Dual thresholds are then applied to generate tight bounding boxes, where proposals falling between the high and scaled low boxes are assigned to corresponding clusters as a pseudo-GT-box candidate set. During training, we select the top-scoring proposal within each cluster…

**Figure 3.** Visualizations of our proposed HGPS algorithm. (a) original image. (b) category-specific heatmap. (c) high threshold mask. (d) low threshold mask.

**Figure 4.** Subordinate relationship between tightest bounding boxes. Although hh

**Figure 5.** Visual comparison results of OICR (top row) and DTH-CP (bottom row) on “person” category. Due to the complex shapes and visual patterns

**Figure 6.** Performance of single threshold on Pascal VOC 2007. It demonstrates

**Figure 7.** Ablation results of different high and low thresholds on Pascal VOC

**Figure 8.** Effect of different box scaling factors on PASCAL VOC 2007. It

**Figure 9.** Convergence curves with and without the classification-ignored loss on

**Figure 10.** Results of the matrices in the process of WSDDN. (a) Desired results.

**Figure 11.** More visualization results on Pascal VOC 2007

Original abstract

Weakly supervised object detection (WSOD) has attracted significant attention in recent years, as it does not require box-level annotations. State-of-the-art methods generally adopt a multi-module network, which employs WSDDN as the multiple instance detection network module and uses multiple instance refinement modules to refine performance. However, these approaches suffer from three key limitations. First, existing methods tend to generate pseudo GT boxes that either focus only on discriminative parts, failing to capture the whole object, or cover the entire object but fail to distinguish between adjacent intra-class instances. Second, the foundational WSDDN architecture lacks a crucial background class representation for each proposal and exhibits a large semantic gap between its branches. Third, prior methods discard ignored proposals during optimization, leading to slow convergence. To address these challenges, we propose the Dual-thresholded heAtmap-guided proposal clustering and Negative Certainty supervision with Enhanced base network (DANCE) method for WSOD. Specifically, we first devise a heatmap-guided proposal selector (HGPS) algorithm, which utilizes dual thresholds on heatmaps to pre-select proposals, enabling pseudo GT boxes to both capture the full object extent and distinguish between adjacent intra-class instances. We then construct a weakly supervised basic detection network (WSBDN), which augments each proposal with a background class representation and uses heatmaps for pre-supervision to bridge the semantic gap between matrices. Finally, we introduce a negative certainty supervision (NCS) loss on ignored proposals to accelerate convergence. Extensive experiments on the challenging PASCAL VOC and MS COCO datasets demonstrate the effectiveness and superiority of our method. Our code is publicly available at https://github.com/gyl2565309278/DANCE.
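The abstract does not define the NCS loss, so the following is only one plausible reading, stated as an assumption: an ignored proposal has no reliable positive label, but any class absent from the image-level labels is a certain negative for it, and the loss penalizes probability mass on those classes. The function name, argument layout, and the exact form of the loss are hypothetical.

```python
import math

def negative_certainty_loss(probs, ignored, image_labels, eps=1e-7):
    """Mean of -log(1 - p) over ignored proposals and classes absent from the image.

    probs: per-proposal class probabilities, one row per proposal.
    ignored: indices of proposals that received no pseudo label.
    image_labels: 0/1 list, 1 if the class is present in the image.
    NOTE: assumed form for illustration; the paper's definition may differ.
    """
    absent = [c for c, present in enumerate(image_labels) if not present]
    terms = [-math.log(max(1.0 - probs[r][c], eps))
             for r in ignored for c in absent]
    return sum(terms) / len(terms) if terms else 0.0
```

Under this reading, ignored proposals still contribute gradient on the classes known to be absent, instead of being dropped from the objective.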

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

DANCE assembles dual-threshold heatmap proposal selection, background class augmentation in the base net, and negative supervision on ignored proposals into a practical WSOD tweak that claims gains on VOC and COCO, but the improvements stay incremental and rest on the assumption that initial heatmaps are already good enough to avoid propagating new errors.


The main takeaway is that this work targets three concrete pain points in current WSOD pipelines: pseudo boxes that either miss object extent or merge adjacent instances, the lack of background modeling in WSDDN-style bases, and the practice of simply dropping ignored proposals. The authors respond with HGPS using dual thresholds on heatmaps, an enhanced WSBDN that adds a background class and heatmap pre-supervision, and an NCS loss that keeps ignored proposals in the optimization. They report better numbers on PASCAL VOC and MS COCO than prior refinement-based methods and release the code, which is useful for anyone who wants to plug the pieces in and check the claims themselves.

Referee Report

1 major / 2 minor

Summary. The manuscript introduces the DANCE method for weakly supervised object detection to overcome three limitations of existing approaches based on WSDDN: generation of pseudo GT boxes that either capture only discriminative parts or fail to separate adjacent instances, absence of background class representation and semantic gaps in the base network, and slow convergence due to discarding ignored proposals. The proposed solution consists of a heatmap-guided proposal selector (HGPS) using dual thresholds, a weakly supervised basic detection network (WSBDN) with added background class and heatmap pre-supervision, and a negative certainty supervision (NCS) loss. Experiments on PASCAL VOC and MS COCO datasets are reported to show the method's effectiveness and superiority.

Significance. Should the central claims regarding improved pseudo GT generation and faster convergence hold, this could represent a useful incremental contribution to the WSOD literature by providing targeted fixes to common pipeline issues. The public availability of the code is noted as a strength for reproducibility in the field.

major comments (1)

The dual-thresholded heatmap-guided proposal clustering (HGPS) is presented as addressing the pseudo GT limitations. However, the choice and generalization of the dual thresholds across different object classes, scales, and densities is not sufficiently justified or ablated, which is load-bearing for the claim that it reliably captures full extent without introducing new failure modes.

minor comments (2)

Notation for the thresholds and loss terms could be more clearly defined in the method section to aid readability.
Consider adding specific performance metrics (e.g., mAP improvements) to the abstract for a more quantitative summary.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive feedback and for recognizing the potential contribution of our work. We address the major comment below and commit to revisions that strengthen the justification for our design choices.


Referee: The dual-thresholded heatmap-guided proposal clustering (HGPS) is presented as addressing the pseudo GT limitations. However, the choice and generalization of the dual thresholds across different object classes, scales, and densities is not sufficiently justified or ablated, which is load-bearing for the claim that it reliably captures full extent without introducing new failure modes.

Authors: We agree that additional justification and empirical analysis of the dual thresholds would strengthen the manuscript. The lower threshold is intended to ensure broad coverage of object extent while the higher threshold enables separation of adjacent instances; these values were selected via grid search on the PASCAL VOC validation set to optimize mAP. In the revised version we will add a dedicated ablation table varying both thresholds independently, report results stratified by object class, scale, and instance density, and include qualitative examples illustrating behavior on challenging cases. This will directly support the generalization claim and rule out new failure modes.
Revision: yes
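The grid search mentioned in the rebuttal could look roughly like the sketch below. It is a generic illustration, not the authors' procedure: `evaluate` is a hypothetical callback that would train and validate with a given threshold pair and return a score such as mAP, and the constraint that the low threshold stays below the high one is an assumption.

```python
from itertools import product

def grid_search_thresholds(evaluate, highs, lows):
    """Return (best_score, t_high, t_low) over all pairs with t_low < t_high.

    evaluate: hypothetical callback (t_high, t_low) -> validation score.
    Raises ValueError if no pair satisfies t_low < t_high.
    """
    results = [(evaluate(t_high, t_low), t_high, t_low)
               for t_high, t_low in product(highs, lows)
               if t_low < t_high]
    return max(results)
```

Keeping the full `results` list, rather than only the maximum, would also yield the two-dimensional ablation table the rebuttal promises.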

Circularity Check

0 steps flagged

No significant circularity: method components validated empirically on external benchmarks


The paper proposes algorithmic modules (HGPS dual-threshold selector, WSBDN with background class and heatmap pre-supervision, NCS loss on ignored proposals) to address stated WSOD limitations. These are not derived from self-referential equations or fitted parameters renamed as predictions; instead, the central claims of improved pseudo-GT quality and convergence are supported by experiments on PASCAL VOC and MS COCO datasets. No load-bearing self-citations, uniqueness theorems, or ansatzes imported from prior author work appear in the provided text. The derivation chain consists of design choices whose performance is measured against independent data rather than reducing to the inputs by construction.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The approach relies on standard assumptions in WSOD literature plus a few method-specific choices whose values are not detailed in the abstract.

free parameters (1)

dual thresholds for heatmap proposal selection
Two thresholds are applied to heatmaps to pre-select proposals; their specific values are chosen to balance full-object coverage and instance separation.

axioms (1)

domain assumption: Heatmaps generated from the network can serve as reliable pre-supervision signals to bridge semantic gaps between branches.
Invoked when constructing WSBDN to use heatmaps for pre-supervision.



Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: we design Heatmap-Guided Proposal Selector (HGPS) algorithm, applying dual thresholds on heatmaps for proposal selection

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · absolute_floor_iff_bare_distinguishability · unclear
Paper passage: negative certainty supervision (NCS) loss on ignored proposals

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

97 extracted references · 97 canonical work pages

[1]

The Pascal Visual Object Classes (VOC) Challenge,

M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman, “The Pascal Visual Object Classes (VOC) Challenge,”Inter- national Journal of Computer Vision (IJCV), vol. 88, no. 2, pp. 303–338, Jun. 1, 2010

work page 2010
[2]

Microsoft COCO: Common Objects in Context,

T.-Y . Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Doll ´ar, and C. L. Zitnick, “Microsoft COCO: Common Objects in Context,” inProceedings of the European Conference on Computer Vi- sion (ECCV), (Zurich, Switzerland, Sep. 6–12, 2014), ser. Lecture Notes in Computer Science, vol. 8693, Springer, Sep. 2014, pp. 740–755

work page 2014
[3]

ImageNet Large Scale Visual Recognition Challenge,

O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei- Fei, “ImageNet Large Scale Visual Recognition Challenge,”Interna- tional Journal of Computer Vision (IJCV), vol. 115, no. 3, pp. 211–252, Dec. 1, 2015

work page 2015
[4]

Weakly Supervised Localiza- tion and Learning with Generic Knowledge,

T. Deselaers, B. Alexe, and V . Ferrari, “Weakly Supervised Localiza- tion and Learning with Generic Knowledge,”International Journal of Computer Vision (IJCV), vol. 100, no. 3, pp. 275–293, Dec. 1, 2012

work page 2012
[5]

Understanding the Difficulty of Training Deep Feedforward Neural Networks,

X. Glorot and Y . Bengio, “Understanding the Difficulty of Training Deep Feedforward Neural Networks,” inProceedings of the Thirteenth Inter- national Conference on Artificial Intelligence and Statistics (AISTATS), (Chia Laguna Resort, Sardinia, Italy, May 13–15, 2010), ser. JMLR Workshop and Conference Proceedings, vol. 9, JMLR.org, May 2010, pp. 249–256

work page 2010
[6]

Goodfellow, Y

I. Goodfellow, Y . Bengio, A. Courville, and Y . Bengio,Deep Learning (Adaptive Computation and Machine Learning). The MIT Press, 2016, vol. 1,ISBN: 978-0-262-03561-3

work page 2016
[7]

ImageNet Classifi- cation with Deep Convolutional Neural Networks,

A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet Classifi- cation with Deep Convolutional Neural Networks,” inAdvances in Neural Information Processing Systems (NeuIPS), (Lake Tahoe, NV , USA, Dec. 3–6, 2012), vol. 25, Curran Associates, Inc., Dec. 2012, pp. 1097–1105

work page 2012
[8]

Very Deep Convolutional Networks for Large-Scale Image Recognition,

K. Simonyan and A. Zisserman, “Very Deep Convolutional Networks for Large-Scale Image Recognition,” inThe Third International Conference on Learning Representations (ICLR), (San Diego, CA, USA, May 7–9, 2015), Computational and Biological Learning Society, May 2015

work page 2015
[9]

Deep Residual Learning for Image Recognition,

K. He, X. Zhang, S. Ren, and J. Sun, “Deep Residual Learning for Image Recognition,” inProceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (Las Vegas, NV , USA, Jun. 27–30, 2016), IEEE, Jun. 2016, pp. 770–778

work page 2016
[10]

Feature Pyramid Networks for Object Detection,

T.-Y . Lin, P. Doll´ar, R. Girshick, K. He, B. Hariharan, and S. Belongie, “Feature Pyramid Networks for Object Detection,” inProceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (Honolulu, HI, USA, Jul. 21–26, 2017), IEEE, Jul. 2017, pp. 936–944

work page 2017
[11]

Attention is All You Need,

A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is All You Need,” inAdvances in Neural Information Processing Systems (NeuIPS), (Long Beach, CA, USA, Dec. 4–9, 2017), vol. 30, Curran Associates, Inc., Dec. 2017, pp. 5998–6008

work page 2017
[12]

An image is worth 16x16 words: Transformers for image recognition at scale,

A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszko- reit, and N. Houlsby, “An image is worth 16x16 words: Transformers for image recognition at scale,” inThe Ninth International Conference on Learning Representations (ICLR), (Virtual Event, May 3–7, 2021), OpenReview.n...

work page 2021
[13]

Swin Transformer: Hierarchical Vision Transformer using Shifted Windows,

Z. Liu, Y . Lin, Y . Cao, H. Hu, Y . Wei, Z. Zhang, S. Lin, and B. Guo, “Swin Transformer: Hierarchical Vision Transformer using Shifted Windows,” inProceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), (Montreal, QC, Canada, Oct. 10–17, 2021), IEEE, Oct. 2021, pp. 9992–10 002

work page 2021
[14]

Multiscale Vision Transformers,

H. Fan, B. Xiong, K. Mangalam, Y . Li, Z. Yan, J. Malik, and C. Feichtenhofer, “Multiscale Vision Transformers,” inProceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), (Mon- treal, QC, Canada, Oct. 10–17, 2021), IEEE, Oct. 2021, pp. 6804–6815

work page 2021
[15]

Selective Search for Object Recognition,

J. R. R. Uijlings, K. E. A. Van De Sande, T. Gevers, and A. W. M. Smeulders, “Selective Search for Object Recognition,”International Journal of Computer Vision (IJCV), vol. 104, no. 2, pp. 154–171, Sep. 1, 2013

work page 2013
[16]

Edge Boxes: Locating Object Proposals from Edges,

C. L. Zitnick and P. Doll ´ar, “Edge Boxes: Locating Object Proposals from Edges,” inProceedings of the European Conference on Computer Vision (ECCV), (Zurich, Switzerland, Sep. 6–12, 2014), ser. Lecture Notes in Computer Science, vol. 8693, Springer, Sep. 2014, pp. 391– 405. 16 TABLE VIII COMPARISON WITH THE STATE-OF-THE-ART METHODS ONPASCAL VOC 2007TEST...

work page 2014
[17]

Multiscale Combinatorial Grouping,

P. A. Arbel ´aez, J. Pont-Tuset, J. T. Barron, F. Marqu ´es, and J. Malik, “Multiscale Combinatorial Grouping,” inProceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (Columbus, OH, USA, Jun. 23–28, 2014), IEEE, Jun. 2014, pp. 328–335

work page 2014
[18]

Efficient Non-Maximum Suppression,

A. Neubeck and L. Van Gool, “Efficient Non-Maximum Suppression,” in18th International Conference on Pattern Recognition (ICPR), (Hong Kong, China, Aug. 20–24, 2006), vol. 3, IEEE, Aug. 2006, pp. 850–855

work page 2006
[19]

Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation,

R. Girshick, J. Donahue, T. Darrell, and J. Malik, “Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation,” inProceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (Columbus, OH, USA, Jun. 23–28, 2014), IEEE, Jun. 2014, pp. 580–587

work page 2014
[20]

Fast R-CNN,

R. Girshick, “Fast R-CNN,” inProceedings of the IEEE International Conference on Computer Vision (ICCV), (Santiago, Chile, Dec. 7–13, 2015), IEEE, Dec. 2015, pp. 1440–1448

work page 2015
[21]

Faster R-CNN: Towards Real- Time Object Detection with Region Proposal Networks,

S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: Towards Real- Time Object Detection with Region Proposal Networks,” inAdvances in Neural Information Processing Systems (NeuIPS), (Montreal, Quebec, Canada, Dec. 7–12, 2015), vol. 28, Curran Associates, Inc., Dec. 2015, pp. 91–99

work page 2015
[22]

SSD: Single Shot MultiBox Detector,

W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y . Fu, and A. C. Berg, “SSD: Single Shot MultiBox Detector,” inProceedings of the European Conference on Computer Vision (ECCV), (Amsterdam, The Netherlands, Oct. 11–14, 2016), ser. Lecture Notes in Computer Science, vol. 9905, Springer, Sep. 17, 2016, pp. 21–37

work page 2016
[23]

You Only Look Once: Unified, Real-Time Object Detection,

J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You Only Look Once: Unified, Real-Time Object Detection,” inProceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (Las Vegas, NV , USA, Jun. 27–30, 2016), IEEE, Jun. 2016, pp. 779–788

work page 2016
[24]

Yolo9000: Better, faster, stronger,

J. Redmon and A. Farhadi, “Yolo9000: Better, faster, stronger,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (Honolulu, HI, USA, Jul. 21–26, 2017), IEEE, Jul. 2017, pp. 6517–6525

work page 2017
[25]

End-to-End Object Detection with Transformers,

N. Carion, F. Massa, G. Synnaeve, N. Usunier, A. Kirillov, and S. Zagoruyko, “End-to-End Object Detection with Transformers,” in Proceedings of the European Conference on Computer Vision (ECCV), 17 TABLE IX COMPARISON WITH THE STATE-OF-THE-ART METHODS ONPASCAL VOC 2007TRAINVAL SET IN TERMS OFCORLOC(%). Methods aero bike bird boat bottle bus car cat chair ...

work page 2020
[26]

Deformable DETR: Deformable Transformers for End-to-End Object Detection,

X. Zhu, W. Su, L. Lu, B. Li, X. Wang, and J. Dai, “Deformable DETR: Deformable Transformers for End-to-End Object Detection,” inThe 9th International Conference on Learning Representations (ICLR), (Virtual Event, May 3–7, 2021), OpenReview.net, May 2021

work page 2021
[27]

Conditional DETR for Fast Training Convergence,

D. Meng, X. Chen, Z. Fan, G. Zeng, H. Li, Y . Yuan, L. Sun, and J. Wang, “Conditional DETR for Fast Training Convergence,” inProceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), (Montreal, QC, Canada, Oct. 10–17, 2021), IEEE, Oct. 2021, pp. 3631– 3640

work page 2021
[28]

DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR,

S. Liu, F. Li, H. Zhang, X. Yang, X. Qi, H. Su, J. Zhu, and L. Zhang, “DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR,” in The 10th International Conference on Learning Representations (ICLR), (Virtual Event, Apr. 25–29, 2022), OpenReview.net, Apr. 2022

work page 2022
[29]

DN- DETR: Accelerate DETR Training by Introducing Query DeNoising,

F. Li, H. Zhang, S. Liu, J. Guo, L. M. Ni, and L. Zhang, “DN- DETR: Accelerate DETR Training by Introducing Query DeNoising,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (New Orleans, LA, USA, Jun. 18–24, 2022), IEEE, Jun. 2022, pp. 13 609–13 617

work page 2022
[30]

Weakly Supervised Deep Detection Net- works,

H. Bilen and A. Vedaldi, “Weakly Supervised Deep Detection Net- works,” inProceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (Las Vegas, NV , USA, Jun. 27–30, 2016), IEEE, Jun. 2016, pp. 2846–2854

work page 2016
[31]

Weakly Su- pervised Object Localization with Progressive Domain Adaptation,

D. Li, J.-B. Huang, Y . Li, S. Wang, and M.-H. Yang, “Weakly Su- pervised Object Localization with Progressive Domain Adaptation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (Las Vegas, NV , USA, Jun. 27–30, 2016), IEEE, Jun. 2016, pp. 3512–3520

work page 2016
[32]

ContextLocNet: Context-Aware Deep Network Models for Weakly Supervised Local- ization,

V . Kantorov, M. Oquab, M. Cho, and I. Laptev, “ContextLocNet: Context-Aware Deep Network Models for Weakly Supervised Local- ization,” inProceedings of the European Conference on Computer Vision (ECCV), (Amsterdam, The Netherlands, Oct. 11–14, 2016), ser. Lecture Notes in Computer Science, vol. 9909, Springer, Sep. 16, 2016, pp. 350–365

work page 2016
[33]

Multiple Instance Detection Network with Online Instance Classifier Refinement,

P. Tang, X. Wang, X. Bai, and W. Liu, “Multiple Instance Detection Network with Online Instance Classifier Refinement,” inProceedings 18 TABLE X COMPARISON WITH THE STATE-OF-THE-ART METHODS ONPASCAL VOC 2012TEST SET IN TERMS OFAP (%). Methods aero bike bird boat bottle bus car cat chair cow table dog horse mbike person plant sheep sofa train tv mAPWSDDN [...

work page 2017
[34]

Deep Self-Taught Learning for Weakly Supervised Object Localization,

Z. Jie, Y . Wei, X. Jin, J. Feng, and W. Liu, “Deep Self-Taught Learning for Weakly Supervised Object Localization,” inProceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (Honolulu, HI, USA, Jul. 21–26, 2017), IEEE, Jul. 2017, pp. 4294–4302

work page 2017
[35]

Weakly Supervised Cascaded Convolutional Networks,

A. Diba, V . Sharma, A. Pazandeh, H. Pirsiavash, and L. Van Gool, “Weakly Supervised Cascaded Convolutional Networks,” inProceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), (Honolulu, HI, USA, Jul. 21–26, 2017), IEEE, Jul. 2017, pp. 5131–5139

work page 2017
[36]

Collaborative Learning for Weakly Supervised Object Detection,

J. Wang, J. Yao, Y . Zhang, and R. Zhang, “Collaborative Learning for Weakly Supervised Object Detection,” inProceedings of the Twenty- Seventh International Joint Conference on Artificial Intelligence (IJCAI), (Stockholm, Sweden, Jul. 13–19, 2018), International Joint Conferences on Artificial Intelligence Organization, Jul. 2018, pp. 971–977

work page 2018
[37]

W2F: A Weakly- Supervised to Fully-Supervised Framework for Object Detection,

Y . Zhang, Y . Bai, M. Ding, Y . Li, and B. Ghanem, “W2F: A Weakly- Supervised to Fully-Supervised Framework for Object Detection,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (Salt Lake City, UT, USA, Jun. 18–23, 2018), IEEE, Jun. 2018, pp. 928–936

work page 2018
[38]

Min-Entropy Latent Model for Weakly Supervised Object Detection,

F. Wan, P. Wei, J. Jiao, Z. Han, and Q. Ye, “Min-Entropy Latent Model for Weakly Supervised Object Detection,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (Salt Lake City, UT, USA, Jun. 18–23, 2018), IEEE, Jun. 2018, pp. 1297–1306

work page 2018
[39]

Zigzag Learning for Weakly Supervised Object Detection,

X. Zhang, J. Feng, H. Xiong, and Q. Tian, “Zigzag Learning for Weakly Supervised Object Detection,” inProceedings of the IEEE/CVF Confer- ence on Computer Vision and Pattern Recognition (CVPR), (Salt Lake City, UT, USA, Jun. 18–23, 2018), IEEE, Jun. 2018, pp. 4262–4270

work page 2018
[40]

Generative Adversarial Learning Towards Fast Weakly Supervised Detection,

Y . Shen, R. Ji, S. Zhang, W. Zuo, and Y . Wang, “Generative Adversarial Learning Towards Fast Weakly Supervised Detection,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (Salt Lake City, UT, USA, Jun. 18–23, 2018), IEEE, Jun. 2018, pp. 5764–5773

work page 2018
[41]

C-WSL: Count-Guided Weakly Supervised Localization,

M. Gao, A. Li, R. Yu, V . I. Morariu, and L. S. Davis, “C-WSL: Count-Guided Weakly Supervised Localization,” inProceedings of the 19 TABLE XI COMPARISON WITH THE STATE-OF-THE-ART METHODS ONPASCAL VOC 2012TRAINVAL SET IN TERMS OFCORLOC(%). Methods aero bike bird boat bottle bus car cat chair cow table dog horse mbike person plant sheep sofa train tv mCorLo...

work page 2018
[42]

ML-LocNet: Improving Object Localization with Multi-view Learning Network,

X. Zhang, Y . Yang, and J. Feng, “ML-LocNet: Improving Object Localization with Multi-view Learning Network,” inProceedings of the European Conference on Computer Vision (ECCV), (Munich, Germany, Sep. 8–14, 2018), ser. Lecture Notes in Computer Science, vol. 11207, Springer, Oct. 7, 2018, pp. 248–263

work page 2018
[43]

Weakly Supervised Region Proposal Network and Object Detection,

P. Tang, X. Wang, A. Wang, Y . Yan, W. Liu, J. Huang, and A. Yuille, “Weakly Supervised Region Proposal Network and Object Detection,” in Proceedings of the European Conference on Computer Vision (ECCV), (Munich, Germany, Sep. 8–14, 2018), ser. Lecture Notes in Computer Science, vol. 11215, Springer, Oct. 6, 2018, pp. 370–386

work page 2018
[44]

TS2C: Tight Box Mining with Surrounding Segmentation Context for Weakly Supervised Object Detection,

Y . Wei, Z. Shen, B. Cheng, H. Shi, J. Xiong, J. Feng, and T. Huang, “TS2C: Tight Box Mining with Surrounding Segmentation Context for Weakly Supervised Object Detection,” inProceedings of the European Conference on Computer Vision (ECCV), (Munich, Germany, Sep. 8–14, 2018), ser. Lecture Notes in Computer Science, vol. 11215, Springer, Oct. 6, 2018, pp. 454–470

work page 2018
[45]

Mixed Supervised Object Detection with Robust Objectness Transfer,

Y . Li, J. Zhang, K. Huang, and J. Zhang, “Mixed Supervised Object Detection with Robust Objectness Transfer,”IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), vol. 41, no. 3, pp. 639–653, Mar. 2019

work page 2019
[46]

PCL: Proposal Cluster Learning for Weakly Supervised Object Detec- tion,

P. Tang, X. Wang, S. Bai, W. Shen, X. Bai, W. Liu, and A. Yuille, “PCL: Proposal Cluster Learning for Weakly Supervised Object Detec- tion,”IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), vol. 42, no. 1, pp. 176–191, Jan. 2020

work page 2020
[47]

Leveraging Prior-Knowledge for Weakly Supervised Object Detection Under a Collaborative Self-Paced Curriculum Learning Framework,

D. Zhang, J. Han, L. Zhao, and D. Meng, “Leveraging Prior-Knowledge for Weakly Supervised Object Detection Under a Collaborative Self-Paced Curriculum Learning Framework,” International Journal of Computer Vision (IJCV), vol. 127, no. 4, pp. 363–380, Apr. 15, 2019

[48]

Cyclic Guidance for Weakly Supervised Joint Detection and Segmentation,

Y. Shen, R. Ji, Y. Wang, Y. Wu, and L. Cao, “Cyclic Guidance for Weakly Supervised Joint Detection and Segmentation,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (Long Beach, CA, USA, Jun. 15–20, 2019), IEEE, Jun. 2019, pp. 697–707

[49]

C-MIL: Continuation Multiple Instance Learning for Weakly Supervised Object Detection,

F. Wan, C. Liu, W. Ke, X. Ji, J. Jiao, and Q. Ye, “C-MIL: Continuation Multiple Instance Learning for Weakly Supervised Object Detection,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (Long Beach, CA, USA, Jun. 15–20, 2019), IEEE, Jun. 2019,...

Fig. 11. More visualization results on the Pascal VOC 2007 test set.

[50]

Dissimilarity Coefficient based Weakly Supervised Object Detection,

A. Arun, C. Jawahar, and M. P. Kumar, “Dissimilarity Coefficient based Weakly Supervised Object Detection,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (Long Beach, CA, USA, Jun. 15–20, 2019), IEEE, Jun. 2019, pp. 9424–9433

[51]

Utilizing the Instability in Weakly Supervised Object Detection,

B. Liu, Y. Gao, N. Guo, X. Ye, F. Wan, H. You, and D. Fan, “Utilizing the Instability in Weakly Supervised Object Detection,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), (Long Beach, CA, USA, Jun. 15–20, 2019), IEEE, Jun. 2019, pp. 11–20

[52]

Label-PEnet: Sequential Label Propagation and Enhancement Networks for Weakly Supervised Instance Segmentation,

W. Ge, W. Huang, S. Guo, and M. R. Scott, “Label-PEnet: Sequential Label Propagation and Enhancement Networks for Weakly Supervised Instance Segmentation,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), (Seoul, Korea (South), Oct. 27–Nov. 2, 2019), IEEE, Oct. 2019, pp. 3344–3353

[53]

Object-Aware Instance Labeling for Weakly Supervised Object Detection,

S. Kosugi, T. Yamasaki, and K. Aizawa, “Object-Aware Instance Labeling for Weakly Supervised Object Detection,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), (Seoul, Korea (South), Oct. 27–Nov. 2, 2019), IEEE, Oct. 2019, pp. 6063–6071

[54]

WSOD2: Learning Bottom-up and Top-down Objectness Distillation for Weakly-supervised Object Detection,

Z. Zeng, B. Liu, J. Fu, H. Chao, and L. Zhang, “WSOD2: Learning Bottom-up and Top-down Objectness Distillation for Weakly-supervised Object Detection,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), (Seoul, Korea (South), Oct. 27–Nov. 2, 2019), IEEE, Nov. 2019, pp. 8291–8299

[55]

Towards Precise End-to-end Weakly Supervised Object Detection Network,

K. Yang, D. Li, and Y. Dou, “Towards Precise End-to-end Weakly Supervised Object Detection Network,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), (Seoul, Korea (South), Oct. 27–Nov. 2, 2019), IEEE, Nov. 2019, pp. 8371–8380

[56]

Weakly Supervised Object Detection with Segmentation Collaboration,

X. Li, M. Kan, S. Shan, and X. Chen, “Weakly Supervised Object Detection with Segmentation Collaboration,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), (Seoul, Korea (South), Oct. 27–Nov. 2, 2019), IEEE, Nov. 2019, pp. 9734–9743

[57]

C-MIDN: Coupled Multiple Instance Detection Network with Segmentation Guidance for Weakly Supervised Object Detection,

Y. Gao, B. Liu, N. Guo, X. Ye, F. Wan, H. You, and D. Fan, “C-MIDN: Coupled Multiple Instance Detection Network with Segmentation Guidance for Weakly Supervised Object Detection,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), (Seoul, Korea (South), Oct. 27–Nov. 2, 2019), IEEE, Nov. 2019, pp. 9833–9842

[58]

Progressive Representation Adaptation for Weakly Supervised Object Localization,

D. Li, J.-B. Huang, Y. Li, S. Wang, and M.-H. Yang, “Progressive Representation Adaptation for Weakly Supervised Object Localization,” IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), vol. 42, no. 6, pp. 1424–1438, Jun. 2020

[59]

Category-Aware Spatial Constraint for Weakly Supervised Detection,

Y. Shen, R. Ji, K. Yang, C. Deng, and C. Wang, “Category-Aware Spatial Constraint for Weakly Supervised Detection,” IEEE Transactions on Image Processing (TIP), vol. 29, pp. 843–858, 2020

[60]

High-Quality Proposals for Weakly Supervised Object Detection,

G. Cheng, J. Yang, D. Gao, L. Guo, and J. Han, “High-Quality Proposals for Weakly Supervised Object Detection,” IEEE Transactions on Image Processing (TIP), vol. 29, pp. 5794–5804, 2020

[61]

Object Instance Mining for Weakly Supervised Object Detection,

C. Lin, S. Wang, D. Xu, Y. Lu, and W. Zhang, “Object Instance Mining for Weakly Supervised Object Detection,” in Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), (Hilton Midtown, NY, USA, Feb. 7–12, 2020), vol. 34, AAAI Press, Apr. 3, 2020, pp. 11482–11489

[62]

Instance-Aware, Context-Focused, and Memory-Efficient Weakly Supervised Object Detection,

Z. Ren, Z. Yu, X. Yang, M.-Y. Liu, Y. J. Lee, A. G. Schwing, and J. Kautz, “Instance-Aware, Context-Focused, and Memory-Efficient Weakly Supervised Object Detection,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (Seattle, WA, USA, Jun. 13–19, 2020), IEEE, Jun. 2020, pp. 10595–10604

[63]

SLV: Spatial Likelihood Voting for Weakly Supervised Object Detection,

Z. Chen, Z. Fu, R. Jiang, Y. Chen, and X.-S. Hua, “SLV: Spatial Likelihood Voting for Weakly Supervised Object Detection,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (Seattle, WA, USA, Jun. 13–19, 2020), IEEE, Jun. 2020, pp. 12992–13001

[64]

Distilling Knowledge from Refinement in Multiple Instance Detection Networks,

L. F. Zeni and C. R. Jung, “Distilling Knowledge from Refinement in Multiple Instance Detection Networks,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), (Seattle, WA, USA, Jun. 13–19, 2020), IEEE, Jun. 2020, pp. 3324–3333

[65]

Boosting Weakly Supervised Object Detection with Progressive Knowledge Transfer,

Y. Zhong, J. Wang, J. Peng, and L. Zhang, “Boosting Weakly Supervised Object Detection with Progressive Knowledge Transfer,” in Proceedings of the European Conference on Computer Vision (ECCV), (Glasgow, UK, Aug. 23–28, 2020), ser. Lecture Notes in Computer Science, vol. 12371, Springer, Nov. 13, 2020, pp. 615–631

[66]

Objectness Consistent Representation for Weakly Supervised Object Detection,

K. Yang, P. Zhang, P. Qiao, Z. Wang, D. Li, and Y. Dou, “Objectness Consistent Representation for Weakly Supervised Object Detection,” in Proceedings of the 28th ACM International Conference on Multimedia (ACM MM), (Virtual Event / Seattle, WA, USA, Oct. 12–16, 2020), Association for Computing Machinery, Oct. 12, 2020, pp. 1688–1696

[67]

UWSOD: Toward Fully-Supervised-Level Capacity Weakly Supervised Object Detection,

Y. Shen, R. Ji, Z. Chen, Y. Wu, and F. Huang, “UWSOD: Toward Fully-Supervised-Level Capacity Weakly Supervised Object Detection,” in Advances in Neural Information Processing Systems (NeurIPS), (Virtual Event, Dec. 6–12, 2020), vol. 33, Curran Associates, Inc., Dec. 2020, pp. 7005–7019

[68]

Comprehensive Attention Self-Distillation for Weakly-Supervised Object Detection,

Z. Huang, Y. Zou, B. V. K. V. Kumar, and D. Huang, “Comprehensive Attention Self-Distillation for Weakly-Supervised Object Detection,” in Advances in Neural Information Processing Systems (NeurIPS), (Virtual Event, Dec. 6–12, 2020), vol. 33, Curran Associates, Inc., Dec. 2020, pp. 16797–16807

[69]

WSODPB: Weakly Supervised Object Detection with PCSNet and Box Regression Module,

S. Yi, H. Ma, X. Li, and Y. Wang, “WSODPB: Weakly Supervised Object Detection with PCSNet and Box Regression Module,” Neurocomputing, vol. 418, pp. 232–240, Dec. 22, 2020

[70]

Weakly Supervised Object Detection Using Proposal- and Semantic-Level Relationships,

D. Zhang, W. Zeng, J. Yao, and J. Han, “Weakly Supervised Object Detection Using Proposal- and Semantic-Level Relationships,” IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), vol. 44, no. 6, pp. 3349–3363, Jun. 2022

[71]

Pyramidal Multiple Instance Detection Network with Mask Guided Self-Correction for Weakly Supervised Object Detection,

Y. Xu, C. Zhou, X. Yu, B. Xiao, and Y. Yang, “Pyramidal Multiple Instance Detection Network with Mask Guided Self-Correction for Weakly Supervised Object Detection,” IEEE Transactions on Image Processing (TIP), vol. 30, pp. 3029–3040, 2021

[72]

Gradingnet: Towards providing reliable supervisions for weakly supervised object detection by grading the box candidates,

Q. Jia, S. Wei, T. Ruan, Y. Zhao, and Y. Zhao, “Gradingnet: Towards providing reliable supervisions for weakly supervised object detection by grading the box candidates,” in Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), (Virtual Event, Feb. 2–9, 2021), vol. 35, AAAI Press, May 18, 2021, pp. 1682–1690

[73]

Instance Mining with Class Feature Banks for Weakly Supervised Object Detection,

Y. Yin, J. Deng, W. Zhou, and H. Li, “Instance Mining with Class Feature Banks for Weakly Supervised Object Detection,” in Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), (Virtual Event, Feb. 2–9, 2021), vol. 35, AAAI Press, May 18, 2021, pp. 3190–3198

[74]

Boosting Weakly Supervised Object Detection via Learning Bounding Box Adjusters,

B. Dong, Z. Huang, Y. Guo, Q. Wang, Z. Niu, and W. Zuo, “Boosting Weakly Supervised Object Detection via Learning Bounding Box Adjusters,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), (Montreal, QC, Canada, Oct. 10–17, 2021), IEEE, Oct. 2021, pp. 2856–2865

[75]

Multiple Instance Detection Networks with Adaptive Instance Refinement,

Z. Wu, J. Wen, Y. Xu, J. Yang, and D. Zhang, “Multiple Instance Detection Networks with Adaptive Instance Refinement,” IEEE Transactions on Multimedia (TMM), vol. 25, pp. 267–279, 2023

[76]

Discrepant Multiple Instance Learning for Weakly Supervised Object Detection,

W. Gao, F. Wan, J. Yue, S. Xu, and Q. Ye, “Discrepant Multiple Instance Learning for Weakly Supervised Object Detection,” Pattern Recognition (PR), vol. 122, p. 108233, Feb. 2022

[77]

Salvage of Supervision in Weakly Supervised Object Detection,

L. Sui, C.-L. Zhang, and J. Wu, “Salvage of Supervision in Weakly Supervised Object Detection,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (New Orleans, LA, USA, Jun. 18–24, 2022), IEEE, Jun. 2022, pp. 14207–14216

[78]

Absolute Wrong Makes Better: Boosting Weakly Supervised Object Detection via Negative Deterministic Information,

G. Wang, X. Zhang, Z. Peng, X. Tang, H. Zhou, and L. Jiao, “Absolute Wrong Makes Better: Boosting Weakly Supervised Object Detection via Negative Deterministic Information,” in Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI), (Vienna, Austria, Jul. 23–29, 2022), International Joint Conferences on Artificia...

[79]

End-to-End Weakly Supervised Object Detection with Sparse Proposal Evolution,

M. Liao, F. Wan, Y. Yao, Z. Han, J. Zou, Y. Wang, B. Feng, P. Yuan, and Q. Ye, “End-to-End Weakly Supervised Object Detection with Sparse Proposal Evolution,” in Proceedings of the European Conference on Computer Vision (ECCV), (Tel Aviv, Israel, Oct. 23–27, 2022), ser. Lecture Notes in Computer Science, vol. 13669, Springer, Nov. 6, 2022, pp. 210–226

[80]

Object Discovery via Contrastive Learning for Weakly Supervised Object Detection,

J. Seo, W. Bae, D. J. Sutherland, J. Noh, and D. Kim, “Object Discovery via Contrastive Learning for Weakly Supervised Object Detection,” in Proceedings of the European Conference on Computer Vision (ECCV), (Tel Aviv, Israel, Oct. 23–27, 2022), ser. Lecture Notes in Computer Science, vol. 13691, Springer, Oct. 23, 2022, pp. 312–329



[42] [42]

ML-LocNet: Improving Object Localization with Multi-view Learning Network,

X. Zhang, Y . Yang, and J. Feng, “ML-LocNet: Improving Object Localization with Multi-view Learning Network,” inProceedings of the European Conference on Computer Vision (ECCV), (Munich, Germany, Sep. 8–14, 2018), ser. Lecture Notes in Computer Science, vol. 11207, Springer, Oct. 7, 2018, pp. 248–263

work page 2018

[43] [43]

Weakly Supervised Region Proposal Network and Object Detection,

P. Tang, X. Wang, A. Wang, Y . Yan, W. Liu, J. Huang, and A. Yuille, “Weakly Supervised Region Proposal Network and Object Detection,” in Proceedings of the European Conference on Computer Vision (ECCV), (Munich, Germany, Sep. 8–14, 2018), ser. Lecture Notes in Computer Science, vol. 11215, Springer, Oct. 6, 2018, pp. 370–386

work page 2018

[44] [44]

TS2C: Tight Box Mining with Surrounding Segmentation Context for Weakly Supervised Object Detection,

Y . Wei, Z. Shen, B. Cheng, H. Shi, J. Xiong, J. Feng, and T. Huang, “TS2C: Tight Box Mining with Surrounding Segmentation Context for Weakly Supervised Object Detection,” inProceedings of the European Conference on Computer Vision (ECCV), (Munich, Germany, Sep. 8–14, 2018), ser. Lecture Notes in Computer Science, vol. 11215, Springer, Oct. 6, 2018, pp. 454–470

work page 2018

[45] [45]

Mixed Supervised Object Detection with Robust Objectness Transfer,

Y . Li, J. Zhang, K. Huang, and J. Zhang, “Mixed Supervised Object Detection with Robust Objectness Transfer,”IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), vol. 41, no. 3, pp. 639–653, Mar. 2019

work page 2019

[46] [46]

PCL: Proposal Cluster Learning for Weakly Supervised Object Detec- tion,

P. Tang, X. Wang, S. Bai, W. Shen, X. Bai, W. Liu, and A. Yuille, “PCL: Proposal Cluster Learning for Weakly Supervised Object Detec- tion,”IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), vol. 42, no. 1, pp. 176–191, Jan. 2020

work page 2020

[47] [47]

Leveraging Prior-Knowledge for Weakly Supervised Object Detection Under a Collaborative Self- Paced Curriculum Learning Framework,

D. Zhang, J. Han, L. Zhao, and D. Meng, “Leveraging Prior-Knowledge for Weakly Supervised Object Detection Under a Collaborative Self- Paced Curriculum Learning Framework,”International Journal of Com- puter Vision (IJCV), vol. 127, no. 4, pp. 363–380, Apr. 15, 2019

work page 2019

[48] [48]

Cyclic Guidance for Weakly Supervised Joint Detection and Segmentation,

Y . Shen, R. Ji, Y . Wang, Y . Wu, and L. Cao, “Cyclic Guidance for Weakly Supervised Joint Detection and Segmentation,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recog- nition (CVPR), (Long Beach, CA, USA, Jun. 15–20, 2019), IEEE, Jun. 2019, pp. 697–707

work page 2019

[49] [49]

C-MIL: Continuation Multiple Instance Learning for Weakly Supervised Object Detection,

F. Wan, C. Liu, W. Ke, X. Ji, J. Jiao, and Q. Ye, “C-MIL: Continuation Multiple Instance Learning for Weakly Supervised Object Detection,” 20 Fig. 11. More visualization results on Pascal VOC 2007testset. 21 inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (Long Beach, CA, USA, Jun. 15–20, 2019), IEEE, Jun. 2019,...

work page 2019

[50] [50]

Dissimilarity Coefficient based Weakly Supervised Object Detection,

A. Arun, C. Jawahar, and M. P. Kumar, “Dissimilarity Coefficient based Weakly Supervised Object Detection,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (Long Beach, CA, USA, Jun. 15–20, 2019), IEEE, Jun. 2019, pp. 9424–9433

work page 2019

[51] [51]

Utilizing the Instability in Weakly Supervised Object Detection,

B. Liu, Y . Gao, N. Guo, X. Ye, F. Wan, H. You, and D. Fan, “Utilizing the Instability in Weakly Supervised Object Detection,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recogni- tion Workshops (CVPRW), (Long Beach, CA, USA, Jun. 15–20, 2019), IEEE, Jun. 2019, pp. 11–20

work page 2019

[52] W. Ge, W. Huang, S. Guo, and M. R. Scott, “Label-PEnet: Sequential Label Propagation and Enhancement Networks for Weakly Supervised Instance Segmentation,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), (Seoul, Korea (South), Oct. 27–Nov. 2, 2019), IEEE, Oct. 2019, pp. 3344–3353

[53] S. Kosugi, T. Yamasaki, and K. Aizawa, “Object-Aware Instance Labeling for Weakly Supervised Object Detection,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), (Seoul, Korea (South), Oct. 27–Nov. 2, 2019), IEEE, Oct. 2019, pp. 6063–6071

[54] Z. Zeng, B. Liu, J. Fu, H. Chao, and L. Zhang, “WSOD²: Learning Bottom-up and Top-down Objectness Distillation for Weakly-supervised Object Detection,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), (Seoul, Korea (South), Oct. 27–Nov. 2, 2019), IEEE, Nov. 2019, pp. 8291–8299

[55] K. Yang, D. Li, and Y. Dou, “Towards Precise End-to-end Weakly Supervised Object Detection Network,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), (Seoul, Korea (South), Oct. 27–Nov. 2, 2019), IEEE, Nov. 2019, pp. 8371–8380

[56] X. Li, M. Kan, S. Shan, and X. Chen, “Weakly Supervised Object Detection with Segmentation Collaboration,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), (Seoul, Korea (South), Oct. 27–Nov. 2, 2019), IEEE, Nov. 2019, pp. 9734–9743

[57] Y. Gao, B. Liu, N. Guo, X. Ye, F. Wan, H. You, and D. Fan, “C-MIDN: Coupled Multiple Instance Detection Network with Segmentation Guidance for Weakly Supervised Object Detection,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), (Seoul, Korea (South), Oct. 27–Nov. 2, 2019), IEEE, Nov. 2019, pp. 9833–9842

[58] D. Li, J.-B. Huang, Y. Li, S. Wang, and M.-H. Yang, “Progressive Representation Adaptation for Weakly Supervised Object Localization,” IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), vol. 42, no. 6, pp. 1424–1438, Jun. 2020

[59] Y. Shen, R. Ji, K. Yang, C. Deng, and C. Wang, “Category-Aware Spatial Constraint for Weakly Supervised Detection,” IEEE Transactions on Image Processing (TIP), vol. 29, pp. 843–858, 2020

[60] G. Cheng, J. Yang, D. Gao, L. Guo, and J. Han, “High-Quality Proposals for Weakly Supervised Object Detection,” IEEE Transactions on Image Processing (TIP), vol. 29, pp. 5794–5804, 2020

[61] C. Lin, S. Wang, D. Xu, Y. Lu, and W. Zhang, “Object Instance Mining for Weakly Supervised Object Detection,” in Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), (Hilton Midtown, NY, USA, Feb. 7–12, 2020), vol. 34, AAAI Press, Apr. 3, 2020, pp. 11 482–11 489

[62] Z. Ren, Z. Yu, X. Yang, M.-Y. Liu, Y. J. Lee, A. G. Schwing, and J. Kautz, “Instance-Aware, Context-Focused, and Memory-Efficient Weakly Supervised Object Detection,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (Seattle, WA, USA, Jun. 13–19, 2020), IEEE, Jun. 2020, pp. 10 595–10 604

[63] Z. Chen, Z. Fu, R. Jiang, Y. Chen, and X.-S. Hua, “SLV: Spatial Likelihood Voting for Weakly Supervised Object Detection,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (Seattle, WA, USA, Jun. 13–19, 2020), IEEE, Jun. 2020, pp. 12 992–13 001

[64] L. F. Zeni and C. R. Jung, “Distilling Knowledge from Refinement in Multiple Instance Detection Networks,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), (Seattle, WA, USA, Jun. 13–19, 2020), IEEE, Jun. 2020, pp. 3324–3333

[65] Y. Zhong, J. Wang, J. Peng, and L. Zhang, “Boosting Weakly Supervised Object Detection with Progressive Knowledge Transfer,” in Proceedings of the European Conference on Computer Vision (ECCV), (Glasgow, UK, Aug. 23–28, 2020), ser. Lecture Notes in Computer Science, vol. 12371, Springer, Nov. 13, 2020, pp. 615–631

[66] K. Yang, P. Zhang, P. Qiao, Z. Wang, D. Li, and Y. Dou, “Objectness Consistent Representation for Weakly Supervised Object Detection,” in Proceedings of the 28th ACM International Conference on Multimedia (ACM MM), (Virtual Event / Seattle, WA, USA, Oct. 12–16, 2020), Association for Computing Machinery, Oct. 12, 2020, pp. 1688–1696

[67] Y. Shen, R. Ji, Z. Chen, Y. Wu, and F. Huang, “UWSOD: Toward Fully-Supervised-Level Capacity Weakly Supervised Object Detection,” in Advances in Neural Information Processing Systems (NeurIPS), (Virtual Event, Dec. 6–12, 2020), vol. 33, Curran Associates, Inc., Dec. 2020, pp. 7005–7019

[68] Z. Huang, Y. Zou, B. V. K. V. Kumar, and D. Huang, “Comprehensive Attention Self-Distillation for Weakly-Supervised Object Detection,” in Advances in Neural Information Processing Systems (NeurIPS), (Virtual Event, Dec. 6–12, 2020), vol. 33, Curran Associates, Inc., Dec. 2020, pp. 16 797–16 807

[69] S. Yi, H. Ma, X. Li, and Y. Wang, “WSODPB: Weakly Supervised Object Detection with PCSNet and Box Regression Module,” Neurocomputing, vol. 418, pp. 232–240, Dec. 22, 2020

[70] D. Zhang, W. Zeng, J. Yao, and J. Han, “Weakly Supervised Object Detection Using Proposal- and Semantic-Level Relationships,” IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), vol. 44, no. 6, pp. 3349–3363, Jun. 2022

[71] Y. Xu, C. Zhou, X. Yu, B. Xiao, and Y. Yang, “Pyramidal Multiple Instance Detection Network with Mask Guided Self-Correction for Weakly Supervised Object Detection,” IEEE Transactions on Image Processing (TIP), vol. 30, pp. 3029–3040, 2021

[72] Q. Jia, S. Wei, T. Ruan, Y. Zhao, and Y. Zhao, “Gradingnet: Towards providing reliable supervisions for weakly supervised object detection by grading the box candidates,” in Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), (Virtual Event, Feb. 2–9, 2021), vol. 35, AAAI Press, May 18, 2021, pp. 1682–1690

[73] Y. Yin, J. Deng, W. Zhou, and H. Li, “Instance Mining with Class Feature Banks for Weakly Supervised Object Detection,” in Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), (Virtual Event, Feb. 2–9, 2021), vol. 35, AAAI Press, May 18, 2021, pp. 3190–3198

[74] B. Dong, Z. Huang, Y. Guo, Q. Wang, Z. Niu, and W. Zuo, “Boosting Weakly Supervised Object Detection via Learning Bounding Box Adjusters,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), (Montreal, QC, Canada, Oct. 10–17, 2021), IEEE, Oct. 2021, pp. 2856–2865

[75] Z. Wu, J. Wen, Y. Xu, J. Yang, and D. Zhang, “Multiple Instance Detection Networks with Adaptive Instance Refinement,” IEEE Transactions on Multimedia (TMM), vol. 25, pp. 267–279, 2023

[76] W. Gao, F. Wan, J. Yue, S. Xu, and Q. Ye, “Discrepant Multiple Instance Learning for Weakly Supervised Object Detection,” Pattern Recognition (PR), vol. 122, p. 108 233, Feb. 2022

[77] L. Sui, C.-L. Zhang, and J. Wu, “Salvage of Supervision in Weakly Supervised Object Detection,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (New Orleans, LA, USA, Jun. 18–24, 2022), IEEE, Jun. 2022, pp. 14 207–14 216

[78] G. Wang, X. Zhang, Z. Peng, X. Tang, H. Zhou, and L. Jiao, “Absolute Wrong Makes Better: Boosting Weakly Supervised Object Detection via Negative Deterministic Information,” in Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI), (Vienna, Austria, Jul. 23–29, 2022), International Joint Conferences on Artificia...

[79] M. Liao, F. Wan, Y. Yao, Z. Han, J. Zou, Y. Wang, B. Feng, P. Yuan, and Q. Ye, “End-to-End Weakly Supervised Object Detection with Sparse Proposal Evolution,” in Proceedings of the European Conference on Computer Vision (ECCV), (Tel Aviv, Israel, Oct. 23–27, 2022), ser. Lecture Notes in Computer Science, vol. 13669, Springer, Nov. 6, 2022, pp. 210–226

[80] J. Seo, W. Bae, D. J. Sutherland, J. Noh, and D. Kim, “Object Discovery via Contrastive Learning for Weakly Supervised Object Detection,” in Proceedings of the European Conference on Computer Vision (ECCV), (Tel Aviv, Israel, Oct. 23–27, 2022), ser. Lecture Notes in Computer Science, vol. 13691, Springer, Oct. 23, 2022, pp. 312–329