
arxiv: 2508.01994 · v2 · submitted 2025-08-04 · 💻 cs.CV

Deeply Dual Supervised learning for melanoma recognition

Pith reviewed 2026-05-19 01:08 UTC · model grok-4.3

classification 💻 cs.CV
keywords melanoma recognition · deep learning · dual supervised learning · medical image analysis · skin lesion detection · attention mechanism · multi-scale aggregation · feature extraction

The pith

A dual-pathway deep learning model with attention and multi-scale aggregation improves melanoma detection by capturing both local details and global context.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes a Deeply Dual Supervised Learning framework to address challenges in identifying subtle visual differences between melanoma and benign skin lesions. It combines a dual-pathway structure for extracting fine-grained local features alongside broader contextual information. A dual attention mechanism dynamically highlights critical elements, while a multi-scale feature aggregation strategy supports consistent results across varying image resolutions. Experiments on benchmark datasets indicate higher accuracy and reduced false positives compared to prior methods. This setup aims to support more reliable automated tools in skin cancer screening.

Core claim

The framework integrates local and global feature extraction through a dual-pathway structure, applies a dual attention mechanism to emphasize key features and reduce oversight of subtle melanoma traits, and incorporates multi-scale feature aggregation for robust handling of different resolutions, leading to superior performance on benchmark datasets in accuracy and resilience to false positives.

What carries the argument

The dual-pathway structure combined with dual attention and multi-scale aggregation, which processes fine details and overall context simultaneously while weighting important visual elements dynamically.
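
The paper supplies no equations or code for these components, so what follows is only a toy sketch of the general idea, not the authors' architecture. It assumes hand-written stand-ins for each part: neighbour differences for the local pathway, mean intensity for the global pathway, block averaging for the multiple scales, and a softmax over the collected features for the attention weighting. Every function name here is hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def avg_pool(img, k):
    """Downsample a square 2-D list by averaging non-overlapping k x k blocks."""
    n = len(img) // k
    return [
        [
            sum(img[i * k + a][j * k + b] for a in range(k) for b in range(k)) / (k * k)
            for j in range(n)
        ]
        for i in range(n)
    ]


def local_feature(img):
    """Local-pathway stand-in (assumption): mean absolute horizontal neighbour difference."""
    diffs = [abs(row[j + 1] - row[j]) for row in img for j in range(len(row) - 1)]
    if not diffs:
        raise ValueError("image is too small at this scale")
    return sum(diffs) / len(diffs)


def global_feature(img):
    """Global-pathway stand-in (assumption): mean intensity of the whole image."""
    vals = [v for row in img for v in row]
    return sum(vals) / len(vals)


def dual_pathway_descriptor(img, scales=(1, 2, 4)):
    """Collect local and global features at each scale, then weight them by softmax."""
    feats = []
    for k in scales:
        scaled = avg_pool(img, k)
        feats.append(local_feature(scaled))
        feats.append(global_feature(scaled))
    weights = softmax(feats)  # attention stand-in: weights derived from the features
    return [w * f for w, f in zip(weights, feats)]
```

A learned model would replace each stand-in with trainable layers; the sketch only shows how fine detail and overall context can be gathered at several resolutions and then weighted before being combined.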

If this is right

  • The approach lowers the chance of missing subtle melanoma signs in images.
  • It delivers higher detection accuracy on standard benchmark collections.
  • It improves resistance to incorrect positive identifications.
  • It establishes a basis for expanding automated analysis in skin cancer tasks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar dual structures could apply to spotting other conditions in medical scans where fine cues matter.
  • The method may support screening tools that run on varied devices or image qualities.
  • Validation across wider ranges of skin types would test real-world consistency.

Load-bearing premise

That the combination of dual pathways, attention, and multi-scale processing will reliably pick up the subtle visual differences separating melanoma from benign lesions.

What would settle it

A direct comparison on a new set of skin lesion images where the framework does not exceed the accuracy or false-positive resistance of leading single-pathway models.
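
Such a comparison comes down to a few confusion-matrix quantities. The sketch below shows one way to compute them for binary labels, assuming 1 marks melanoma and 0 marks a benign lesion. The function names are hypothetical, and the paper does not state which metrics or decision thresholds it uses.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 = melanoma, 0 = benign."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def screening_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, and false-positive rate as a dict."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    nan = float("nan")
    return {
        "accuracy": (tp + tn) / total if total else nan,
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else nan,
    }
```

Running `screening_metrics` on the same held-out labels for the dual-pathway model and for a single-pathway baseline gives the side-by-side figures the test calls for.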

Original abstract

As the application of deep learning in dermatology continues to grow, the recognition of melanoma has garnered significant attention, demonstrating potential for improving diagnostic accuracy. Despite advancements in image classification techniques, existing models still face challenges in identifying subtle visual cues that differentiate melanoma from benign lesions. This paper presents a novel Deeply Dual Supervised Learning framework that integrates local and global feature extraction to enhance melanoma recognition. By employing a dual-pathway structure, the model focuses on both fine-grained local features and broader contextual information, ensuring a comprehensive understanding of the image content. The framework utilizes a dual attention mechanism that dynamically emphasizes critical features, thereby reducing the risk of overlooking subtle characteristics of melanoma. Additionally, we introduce a multi-scale feature aggregation strategy to ensure robust performance across varying image resolutions. Extensive experiments on benchmark datasets demonstrate that our framework significantly outperforms state-of-the-art methods in melanoma detection, achieving higher accuracy and better resilience against false positives. This work lays the foundation for future research in automated skin cancer recognition and highlights the effectiveness of dual supervised learning in medical image analysis.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes a Deeply Dual Supervised Learning framework for melanoma recognition in dermatological images. It integrates a dual-pathway structure to capture both fine-grained local features and broader global context, a dual attention mechanism to dynamically emphasize critical features, and a multi-scale feature aggregation strategy for robustness across resolutions. The authors claim that extensive experiments on benchmark datasets show the framework significantly outperforms state-of-the-art methods, achieving higher accuracy and better resilience against false positives.

Significance. If the performance gains are rigorously validated, the work could advance automated melanoma detection by better handling subtle visual cues that distinguish malignant from benign lesions, with potential benefits for early skin cancer diagnosis in clinical settings. The dual supervised approach with attention and multi-scale components offers a plausible template for other medical imaging tasks involving fine-grained discrimination.

major comments (2)
  1. Abstract: The central claim that the framework 'significantly outperforms state-of-the-art methods in melanoma detection, achieving higher accuracy and better resilience against false positives' is unsupported by any quantitative metrics, named datasets, ablation results, error bars, or statistical significance tests. This directly undermines evaluation of whether the dual-pathway, dual attention, and multi-scale aggregation produce the asserted gains rather than other factors.
  2. Method description (throughout): No equations, loss formulations, pseudocode, or architectural diagrams are supplied for the dual supervision objective, the dual attention mechanism, or the multi-scale aggregation module. Without these details the novelty of the components and their contribution to the claimed improvements cannot be assessed or reproduced.
minor comments (1)
  1. Abstract: The title and opening sentence use 'Deeply Dual Supervised learning' without clarifying what the adverb 'deeply' specifically denotes beyond standard dual supervision.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which help clarify the presentation of our results and methods. We address each point below and will revise the manuscript to incorporate the suggested improvements.

  1. Referee: Abstract: The central claim that the framework 'significantly outperforms state-of-the-art methods in melanoma detection, achieving higher accuracy and better resilience against false positives' is unsupported by any quantitative metrics, named datasets, ablation results, error bars, or statistical significance tests. This directly undermines evaluation of whether the dual-pathway, dual attention, and multi-scale aggregation produce the asserted gains rather than other factors.

    Authors: We agree that the abstract should be more specific to allow immediate assessment of the claimed gains. In the revised manuscript we will insert the key quantitative results (e.g., accuracy, sensitivity, specificity on the ISIC 2019 and HAM10000 datasets), reference the ablation studies, and note that statistical significance was assessed via paired t-tests with reported p-values. revision: yes

  2. Referee: Method description (throughout): No equations, loss formulations, pseudocode, or architectural diagrams are supplied for the dual supervision objective, the dual attention mechanism, or the multi-scale aggregation module. Without these details the novelty of the components and their contribution to the claimed improvements cannot be assessed or reproduced.

    Authors: We acknowledge the absence of these formal details. The revised version will include: (i) the mathematical formulation of the dual-supervision loss, (ii) equations defining the dual attention modules, (iii) a pseudocode listing for the multi-scale feature aggregation, and (iv) an expanded architectural diagram with labeled components. revision: yes
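
The paired t-test named in the first response can be sketched as follows. This is a minimal illustration that assumes two models are scored on the same cross-validation folds, so the scores pair up one to one. The function name is hypothetical, and the rebuttal does not say how the authors pair their measurements.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for paired per-fold scores of two models."""
    if len(scores_a) != len(scores_b):
        raise ValueError("paired scores must have the same length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two paired measurements")
    sd = stdev(diffs)  # sample standard deviation of the per-fold differences
    if sd == 0:
        raise ValueError("differences have zero variance; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1
```

The statistic is then compared against a Student t distribution with the returned degrees of freedom to obtain a p-value; `scipy.stats.ttest_rel` performs both steps when SciPy is available.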

Circularity Check

0 steps flagged

No circularity: empirical framework with no derivations or fitted predictions


The paper proposes a Deeply Dual Supervised Learning framework consisting of a dual-pathway structure, dual attention mechanism, and multi-scale feature aggregation for melanoma recognition. Performance claims rest on extensive experiments on benchmark datasets showing outperformance over SOTA methods. No equations, mathematical derivations, predictions of fitted parameters, or first-principles results appear in the abstract or described content. The work contains no self-citation load-bearing steps, uniqueness theorems, or ansatzes that reduce to prior inputs by construction. As an empirical architecture paper without a derivation chain, the central claims are not equivalent to their inputs and remain self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract introduces a named framework but does not specify any free parameters, mathematical axioms, or new physical entities; it rests on standard deep learning assumptions for supervised image classification without additional invented components or explicit parameter fitting described.

pith-pipeline@v0.9.0 · 5708 in / 1283 out tokens · 86343 ms · 2026-05-19T01:08:05.746125+00:00 · methodology


