Recognition: 2 theorem links · Lean Theorem
UniPCB: A Generation-Assisted Detection Framework for PCB Defect Inspection
Pith reviewed 2026-05-12 00:46 UTC · model grok-4.3
The pith
A joint generation-detection framework for PCBs uses multi-modal synthesis to augment scarce defect data and reach 98.0 percent mAP@0.5.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors establish that a generation-assisted detection framework, pairing a Multi-modal Condition Generator that feeds a ScaleEncoder and FiLM-style Condition Modulation for synthesis with an Inverted Residual Shift Attention and a Cross-level Complementary Fusion Block for detection, jointly overcomes data scarcity and representation limits: it delivers mAP@0.5 of 98.0 percent and mAP@0.5:0.95 of 61.8 percent on DsPCBSD+, while the generation branch reaches FID 129.61 and SSIM 0.619.
What carries the argument
The Multi-modal Condition Generator with ScaleEncoder and Condition Modulation that synthesizes aligned defects from parallel edge-depth-text inputs, paired with the detector's Inverted Residual Shift Attention and Cross-level Complementary Fusion Block that fuses global context and local texture via shift convolution and pixel-level gates.
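The FiLM-style modulation is the piece that lets the multi-modal conditions steer synthesis at every scale. Below is a minimal PyTorch sketch of spatially-adaptive FiLM conditioning, assuming the ScaleEncoder has already produced a condition map at the same resolution as the U-Net features; the module name, layer sizes, and the identity-centered form are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SpatialFiLM(nn.Module):
    """FiLM-style spatially-adaptive modulation: a condition map predicts
    per-pixel scale (gamma) and shift (beta) applied to U-Net features.
    A sketch only; channel counts and names are assumptions, not the paper's."""
    def __init__(self, cond_channels: int, feat_channels: int):
        super().__init__()
        # Predict gamma and beta jointly from the embedded condition map.
        self.to_gamma_beta = nn.Conv2d(cond_channels, 2 * feat_channels,
                                       kernel_size=3, padding=1)

    def forward(self, feats: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, H, W) diffusion U-Net features at one resolution
        # cond:  (B, C_cond, H, W) fused edge/depth/text condition at that resolution
        gamma, beta = self.to_gamma_beta(cond).chunk(2, dim=1)
        return (1 + gamma) * feats + beta  # identity-centered modulation


# Toy usage: modulate 64-channel features with a 32-channel condition map.
film = SpatialFiLM(cond_channels=32, feat_channels=64)
x = torch.randn(2, 64, 32, 32)
c = torch.randn(2, 32, 32, 32)
out = film(x, c)  # (2, 64, 32, 32)
```

In the described pipeline this kind of block would be applied at each of the four U-Net resolutions; because gamma and beta vary per pixel, the modulation can follow the edge and depth structure rather than conditioning globally.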
If this is right
- Synthesized defects directly enrich the scarce IIoT training set, so gains in generation quality translate into higher detection mAP.
- The multi-modal conditioning enables structurally aligned samples that help the detector handle complex circuit backgrounds better than single-condition methods.
- The joint pipeline outperforms all compared detection and generation baselines on the DsPCBSD+ benchmark.
- The IIoT pipeline supports real-time inspection by addressing both data volume and feature extraction challenges in one system.
Where Pith is reading between the lines
- The same multi-modal conditioning strategy could be tested on other industrial inspection tasks where defect samples are rare, such as weld or fabric defect detection.
- Ablating the generation branch would show whether the attention and fusion blocks alone deliver part of the accuracy gain even without extra data.
- If domain shift between generated and real defects proves larger than reported, the framework might need additional adaptation steps for new PCB manufacturing lines.
Load-bearing premise
The synthesized defect samples must be realistic enough and distributionally close enough to real IIoT PCB images that adding them to the training set improves detection accuracy on actual data rather than introducing harmful artifacts or shift.
What would settle it
Training the detector on real samples alone versus real samples plus the generated ones and measuring mAP on a held-out set of real PCB defects; if the augmented version shows no gain or a drop, the core benefit of generation assistance is refuted.
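A minimal sketch of that deciding experiment, assuming only that some training-and-evaluation routine for the detection branch exists; train_and_eval and the dataset arguments are hypothetical placeholders, not the authors' pipeline.

```python
from typing import Callable, Sequence

def generation_gain(
    train_and_eval: Callable[[Sequence, Sequence], float],
    real_train: Sequence,
    synth_train: Sequence,
    real_test: Sequence,
) -> dict:
    """Compare mAP of a detector trained on real data only vs. real + synthesized.

    train_and_eval(train_set, test_set) stands in for any routine that trains the
    detection branch from scratch under fixed hyperparameters and returns mAP@0.5
    on test_set; it is a placeholder, since the authors' code is not described.
    """
    map_real = train_and_eval(list(real_train), list(real_test))
    map_aug = train_and_eval(list(real_train) + list(synth_train), list(real_test))
    return {
        "mAP_real_only": map_real,
        "mAP_real_plus_synth": map_aug,
        "gain": map_aug - map_real,  # <= 0 would refute the generation-assisted claim
    }
```

A gain at or below zero on the held-out real split would refute the core benefit; a consistently positive gain under matched hyperparameters and seeds would support it.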
Original abstract
In the Industrial Internet of Things (IIoT), enabling intelligent, real-time Printed Circuit Board (PCB) defect inspection is critical for ensuring product reliability. However, existing IIoT-based visual inspection systems face two compounding challenges: scarce and imbalanced defect samples that limit model training, and insufficient feature representation under complex circuit backgrounds. Existing generation methods rely on single-modality conditions with coarse structural control, while detection methods improve architectures without addressing the data bottleneck. To resolve both challenges jointly, we propose a generation-assisted PCB defect inspection framework that integrates controlled defect synthesis with task-specific defect detection within an IIoT-enabled pipeline. On the generation side, a Multi-modal Condition Generator extracts complementary edge, depth, and text conditions in parallel. A ScaleEncoder then embeds these conditions into the diffusion U-Net at four resolutions, and a Condition Modulation applies FiLM-style spatially-adaptive modulation at each scale, enabling structurally aligned and defect-aware sample synthesis to augment the scarce IIoT dataset. On the detection side, an Inverted Residual Shift Attention couples self-attention with shift-wise convolution to jointly capture global context and local texture, and a Cross-level Complementary Fusion Block generates pixel-level gates for selective cross-level feature fusion. The synthesized samples directly enrich the detection training set, so that improvements in generation compound with improvements in detection. Extensive experiments on DsPCBSD+ demonstrate that UniPCB achieves mAP@0.5 of 98.0% and mAP@0.5:0.95 of 61.8% on defect detection, surpassing all compared methods, while the generation branch attains an FID of 129.61 and SSIM of 0.619, outperforming existing conditional generation approaches.
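On the detection side, the abstract describes pixel-level gates that select between cross-level features. The following is a minimal sketch of such gated fusion, assuming the coarser map is upsampled to the finer resolution before gating; it is one plausible reading of the Cross-level Complementary Fusion Block, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedCrossLevelFusion(nn.Module):
    """Fuse a fine (high-res) and a coarse (low-res) feature map with a
    per-pixel gate. The sigmoid gate and equal channel counts are assumptions."""
    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),  # pixel-level gate in [0, 1]
        )

    def forward(self, fine: torch.Tensor, coarse: torch.Tensor) -> torch.Tensor:
        # Bring the coarse map to the fine map's spatial size.
        coarse_up = F.interpolate(coarse, size=fine.shape[-2:], mode="nearest")
        g = self.gate(torch.cat([fine, coarse_up], dim=1))
        # The gate decides, per pixel, how much local texture vs. global context to keep.
        return g * fine + (1 - g) * coarse_up


# Toy usage with 128-channel feature maps from two pyramid levels.
fuse = GatedCrossLevelFusion(channels=128)
p3 = torch.randn(1, 128, 64, 64)
p4 = torch.randn(1, 128, 32, 32)
fused = fuse(p3, p4)  # (1, 128, 64, 64)
```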
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes UniPCB, a generation-assisted framework for PCB defect inspection that integrates a Multi-modal Condition Generator (using parallel edge, depth, and text conditions fed via ScaleEncoder and Condition Modulation into a diffusion U-Net) for synthesizing defect samples to augment scarce IIoT data, with a detection network employing Inverted Residual Shift Attention and Cross-level Complementary Fusion for improved feature representation. On the DsPCBSD+ dataset, it reports mAP@0.5 of 98.0% and mAP@0.5:0.95 of 61.8% for detection (surpassing compared methods) alongside generation metrics of FID 129.61 and SSIM 0.619 (outperforming existing conditional generators), claiming that synthesized samples directly enrich training and compound with architectural improvements.
Significance. If the central claim holds, the work offers a practical pipeline for addressing data imbalance in industrial PCB inspection by jointly optimizing synthesis and detection, which could improve real-world IIoT reliability. The explicit reporting of both generation quality metrics and end-task mAP provides a basis for comparison, and the multi-modal conditioning approach is a concrete technical contribution. However, the lack of isolating experiments limits attribution of gains.
major comments (3)
- [Abstract / Experiments] Abstract and experiments section: The headline claim that 'the synthesized samples directly enrich the detection training set, so that improvements in generation compound with improvements in detection' is load-bearing for the generation-assisted framing, yet no ablation is described that trains the detection branch (Inverted Residual Shift Attention + Cross-level Complementary Fusion) on real DsPCBSD+ data only versus real + generated samples. Without this, the mAP@0.5 of 98.0% and mAP@0.5:0.95 of 61.8% cannot be attributed to the Multi-modal Condition Generator rather than the detection modules alone.
- [Abstract] Abstract: The reported mAP and generation metrics are presented without error bars, statistical significance tests (e.g., paired t-tests across runs), details on train/validation/test splits, or full baseline re-implementation protocols. This makes it impossible to assess whether the gains over compared methods are robust or sensitive to implementation choices.
- [Abstract / Methods] Generation branch description: The Multi-modal Condition Generator is said to produce 'structurally aligned and defect-aware' samples, but the abstract provides no quantitative measure of distributional alignment (e.g., feature-space distance to real defects) or qualitative failure cases, leaving the weakest assumption—that the FID 129.61 / SSIM 0.619 outputs avoid harmful domain shift—unverified.
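The second major comment asks for multi-run reporting and significance testing. A minimal sketch of the requested protocol, assuming per-seed mAP@0.5 values have already been collected for UniPCB and one baseline; the numbers below are placeholders, and scipy's paired t-test is standard machinery rather than anything specific to this paper.

```python
import statistics
from scipy import stats

# Hypothetical per-seed mAP@0.5 values over 5 independent runs (placeholders).
ours     = [0.979, 0.981, 0.980, 0.978, 0.982]
baseline = [0.972, 0.975, 0.973, 0.974, 0.971]

print(f"ours:     {statistics.mean(ours):.4f} +/- {statistics.stdev(ours):.4f}")
print(f"baseline: {statistics.mean(baseline):.4f} +/- {statistics.stdev(baseline):.4f}")

# Paired t-test across matched seeds: are the per-seed differences nonzero?
t, p = stats.ttest_rel(ours, baseline)
print(f"paired t = {t:.3f}, p = {p:.4f}")
```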
minor comments (2)
- [Abstract] The abstract uses 'DsPCBSD+' without defining the dataset or citing its source; this should be clarified with a reference or brief description in the main text.
- [Methods] Notation for the ScaleEncoder and Condition Modulation (FiLM-style) is introduced without equations; adding a short mathematical formulation would improve reproducibility.
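One way the formulation requested in the second minor comment could read, hedged as a reconstruction from the abstract rather than the paper's own notation: let $F_s$ be the diffusion U-Net features at scale $s$, $c_s$ the ScaleEncoder embedding of the fused edge, depth, and text conditions at that scale, and $g_s$, $h_s$ small convolutional heads predicting the spatial FiLM parameters.

```latex
\[
  \gamma_s = g_s(c_s), \qquad \beta_s = h_s(c_s), \qquad
  \tilde{F}_s = \left(1 + \gamma_s\right) \odot F_s + \beta_s,
  \qquad s \in \{1, 2, 3, 4\},
\]
```

where $\odot$ is element-wise multiplication and $\gamma_s$, $\beta_s$ vary per pixel, which is what makes the modulation spatially adaptive; the identity-centered $(1+\gamma_s)$ form is an assumption, not something stated in the abstract.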
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. The comments highlight important aspects of experimental validation and reporting that we will address to strengthen the paper. Below we respond point-by-point to the major comments.
Point-by-point responses
-
Referee: [Abstract / Experiments] The headline claim that 'the synthesized samples directly enrich the detection training set, so that improvements in generation compound with improvements in detection' is load-bearing for the generation-assisted framing, yet no ablation is described that trains the detection branch (Inverted Residual Shift Attention + Cross-level Complementary Fusion) on real DsPCBSD+ data only versus real + generated samples. Without this, the mAP@0.5 of 98.0% and mAP@0.5:0.95 of 61.8% cannot be attributed to the Multi-modal Condition Generator rather than the detection modules alone.
Authors: We agree that an explicit ablation isolating the contribution of the synthesized samples is necessary to substantiate the generation-assisted claim. In the revised manuscript, we will add this ablation: training the full detection network (Inverted Residual Shift Attention + Cross-level Complementary Fusion) on real DsPCBSD+ data only, and comparing it directly to training on the combined real + generated set under identical hyperparameters and splits. This will quantify the mAP gains attributable to the Multi-modal Condition Generator. revision: yes
-
Referee: [Abstract] The reported mAP and generation metrics are presented without error bars, statistical significance tests (e.g., paired t-tests across runs), details on train/validation/test splits, or full baseline re-implementation protocols. This makes it impossible to assess whether the gains over compared methods are robust or sensitive to implementation choices.
Authors: We will revise the experiments section to report mean mAP values with standard deviations across multiple independent runs (e.g., 5 seeds), include paired t-tests or equivalent significance tests against baselines, explicitly state the train/validation/test split ratios and sampling strategy on DsPCBSD+, and provide complete re-implementation details (hyperparameters, data augmentation, and training schedules) for all compared methods to enable robust assessment of the gains. revision: yes
-
Referee: [Abstract / Methods] Generation branch description: The Multi-modal Condition Generator is said to produce 'structurally aligned and defect-aware' samples, but the abstract provides no quantitative measure of distributional alignment (e.g., feature-space distance to real defects) or qualitative failure cases, leaving the weakest assumption—that the FID 129.61 / SSIM 0.619 outputs avoid harmful domain shift—unverified.
Authors: FID and SSIM are established metrics for generation fidelity and structural similarity. To further verify distributional alignment and absence of harmful domain shift, the revised version will add quantitative analysis (e.g., average feature-space L2 distances using embeddings from a pre-trained ResNet on real vs. generated defect patches) and a dedicated qualitative section showing representative success cases alongside any observed failure modes (e.g., over-generated artifacts or misalignment). revision: yes
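A minimal sketch of the distributional check the authors propose, using torchvision's pretrained ResNet-18 as the feature extractor; the patch tensors are placeholders and the specific backbone is an assumption, not necessarily the one the revision would use.

```python
import torch
import torch.nn as nn
import torchvision.models as models

# Pretrained ResNet-18 with the classifier removed -> 512-d embeddings.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = nn.Identity()
backbone.eval()

@torch.no_grad()
def embed(patches: torch.Tensor) -> torch.Tensor:
    """patches: (N, 3, 224, 224) normalized defect crops -> (N, 512) features."""
    return backbone(patches)

# Placeholder batches standing in for real and generated defect patches.
real_patches = torch.randn(16, 3, 224, 224)
fake_patches = torch.randn(16, 3, 224, 224)

real_mu = embed(real_patches).mean(dim=0)
fake_mu = embed(fake_patches).mean(dim=0)

# Feature-space L2 distance between the two centroids; values much larger than a
# real-vs-real split baseline would signal harmful domain shift in the synthesized data.
print(torch.dist(real_mu, fake_mu).item())
```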
Circularity Check
No significant circularity; empirical framework with external validation
full rationale
The paper presents an architectural framework (Multi-modal Condition Generator with ScaleEncoder and Condition Modulation; Inverted Residual Shift Attention and Cross-level Complementary Fusion) whose performance claims rest on empirical metrics (mAP@0.5 98.0%, mAP@0.5:0.95 61.8%, FID 129.61, SSIM 0.619) measured on the external DsPCBSD+ dataset. No equations, derivations, or first-principles results are described that reduce by construction to fitted inputs, self-citations, or renamed patterns. The generation-assisted claim is presented as an empirical outcome rather than a tautological restatement of training objectives, satisfying the self-contained criterion.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Multi-modal conditions (edge, depth, text) can be extracted in parallel and embedded to produce structurally aligned defect images.
- domain assumption: Coupling self-attention with shift-wise convolution and cross-level gating improves feature representation under complex circuit backgrounds.
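The first assumption is that edge, depth, and text conditions can be extracted in parallel from a PCB image. A minimal sketch of one such extraction pass, using OpenCV's Canny detector for edges; the depth-estimator call and the text prompt template are hypothetical placeholders for whatever models the actual pipeline uses.

```python
import cv2
import numpy as np

def extract_conditions(bgr: np.ndarray, defect_class: str, depth_model=None) -> dict:
    """Build the three parallel conditions for one PCB image.

    bgr: HxWx3 uint8 image. depth_model is a placeholder for any monocular depth
    estimator (e.g. a Depth Anything-style network); if absent, a flat map keeps
    the sketch runnable.
    """
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
    edge = cv2.Canny(gray, threshold1=100, threshold2=200)        # structural edges
    depth = depth_model(bgr) if depth_model else np.zeros_like(gray, dtype=np.float32)
    text = f"a PCB image with a {defect_class} defect"            # text condition (assumed template)
    return {"edge": edge, "depth": depth, "text": text}

# Toy usage on a synthetic image.
img = (np.random.rand(256, 256, 3) * 255).astype(np.uint8)
conds = extract_conditions(img, "short circuit")
```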
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
unclear: Relation between the paper passage and the cited Recognition theorem.
"Multi-modal Condition Generator extracts complementary edge, depth, and text conditions... ScaleEncoder embeds these conditions into the diffusion U-Net at four resolutions, and a Condition Modulation applies FiLM-style spatially-adaptive modulation"
-
IndisputableMonolith/Foundation/ArithmeticFromLogic.lean · LogicNat induction and embed_strictMono · unclear
unclear: Relation between the paper passage and the cited Recognition theorem.
"Inverted Residual Shift Attention couples self-attention with shift-wise convolution... Cross-level Complementary Fusion Block generates pixel-level gates"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.