arxiv: 2508.06816 · v3 · submitted 2025-08-09 · 💻 cs.CV

DualResolution Residual Architecture with Artifact Suppression for Melanocytic Lesion Segmentation

Pith reviewed 2026-05-19 00:37 UTC · model grok-4.3

classification 💻 cs.CV
keywords melanocytic lesion segmentation · dermoscopy · dual resolution · artifact suppression · boundary aware · multi-task learning · skin cancer · image segmentation

The pith

A dual-resolution residual architecture with artifact suppression produces more precise segmentation of melanocytic lesions in dermoscopic images.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a dual-resolution residual network designed specifically for segmenting melanocytic tumors in dermoscopic images. It uses a high-resolution stream to maintain fine boundary details and a pooled stream to capture broader context, connected through boundary-aware residual links and channel attention. A lightweight artifact suppression block addresses issues like hairs and bubbles, while a multi-task training approach combines Dice-Tversky loss with boundary loss and contrastive regularization to handle small datasets. This design aims to deliver accurate masks without heavy post-processing. The authors show through evaluations on public benchmarks that it improves boundary precision and other metrics over standard encoder-decoder models, supporting better automated skin cancer screening.
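The two-stream idea described above can be illustrated with a toy sketch. It is not the authors' network: learned convolutions are replaced by fixed operations (2x2 average pooling for the pooled stream, nearest-neighbour upsampling, a finite-difference edge map standing in for the boundary-aware residual link), and the names `avg_pool2x2`, `upsample2x`, `edge_map`, `fuse`, and the parameter `edge_weight` are all hypothetical.

```python
from typing import List

Grid = List[List[float]]


def avg_pool2x2(x: Grid) -> Grid:
    """Pooled stream: average non-overlapping 2x2 windows (even H and W assumed)."""
    return [[(x[i][j] + x[i][j + 1] + x[i + 1][j] + x[i + 1][j + 1]) / 4.0
             for j in range(0, len(x[0]), 2)]
            for i in range(0, len(x), 2)]


def upsample2x(x: Grid) -> Grid:
    """Nearest-neighbour upsampling back to the high-resolution grid."""
    return [[x[i // 2][j // 2] for j in range(2 * len(x[0]))]
            for i in range(2 * len(x))]


def edge_map(x: Grid) -> Grid:
    """Crude edge strength: absolute horizontal plus vertical forward differences."""
    h, w = len(x), len(x[0])
    return [[abs(x[i][min(j + 1, w - 1)] - x[i][j])
             + abs(x[min(i + 1, h - 1)][j] - x[i][j])
             for j in range(w)]
            for i in range(h)]


def fuse(high: Grid, edge_weight: float = 0.5) -> Grid:
    """Residual fusion: high-res features + upsampled context + weighted edges.

    edge_weight is an illustrative constant, not a value from the paper.
    """
    context = upsample2x(avg_pool2x2(high))
    edges = edge_map(high)
    return [[high[i][j] + context[i][j] + edge_weight * edges[i][j]
             for j in range(len(high[0]))]
            for i in range(len(high))]


if __name__ == "__main__":
    # A vertical step edge: left half background, right half lesion.
    step = [[0, 0, 1, 1] for _ in range(4)]
    for row in fuse(step):
        print(row)
```

In this toy input the edge term only changes the column where the step occurs, which is the intuition behind injecting edge information into the fused feature map.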

Core claim

The dual-resolution residual architecture incorporates a high-resolution stream that preserves fine boundary details alongside a pooled stream for multi-scale context, integrated via boundary-aware residual connections and channel attention, together with a lightweight artifact suppression block and multi-task training using Dice-Tversky loss, explicit boundary loss, and contrastive regularizer, enabling the generation of pixel-accurate segmentation masks for melanocytic lesions without extensive post-processing or complex pre-training.
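The summary names a Dice-Tversky loss without giving its exact form. A minimal sketch, assuming it is a convex mix of a soft Dice loss and a soft Tversky loss: the Tversky index is TP / (TP + alpha·FP + beta·FN), and Dice is the special case alpha = beta = 0.5. The function names, the `mix` parameter, and the default alpha and beta values are illustrative assumptions, not values reported by the paper.

```python
from typing import Sequence


def tversky_index(pred: Sequence[float], target: Sequence[int],
                  alpha: float = 0.5, beta: float = 0.5,
                  eps: float = 1e-7) -> float:
    """Soft Tversky index over flattened probabilities and binary labels."""
    tp = sum(p * t for p, t in zip(pred, target))
    fp = sum(p * (1 - t) for p, t in zip(pred, target))
    fn = sum((1 - p) * t for p, t in zip(pred, target))
    return (tp + eps) / (tp + alpha * fp + beta * fn + eps)


def dice_tversky_loss(pred: Sequence[float], target: Sequence[int],
                      alpha: float = 0.3, beta: float = 0.7,
                      mix: float = 0.5) -> float:
    """Assumed combination: mix * Dice loss + (1 - mix) * Tversky loss.

    With beta > alpha, missed lesion pixels (FN) cost more than false alarms (FP).
    """
    dice = tversky_index(pred, target, 0.5, 0.5)      # Dice as a Tversky special case
    tversky = tversky_index(pred, target, alpha, beta)
    return mix * (1.0 - dice) + (1.0 - mix) * (1.0 - tversky)


if __name__ == "__main__":
    target = [1, 1, 0, 0]
    print(dice_tversky_loss([1, 1, 1, 0], target))  # one false positive
    print(dice_tversky_loss([1, 0, 0, 0], target))  # one false negative
```

Under these assumed defaults the false-negative case yields the larger loss, which is one common motivation for a Tversky term when under-segmentation is the more costly error.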

What carries the argument

Dual-resolution streams with boundary-aware residual connections and a lightweight artifact suppression block, trained via multi-task losses including Dice-Tversky, boundary, and contrastive terms.

If this is right

  • Enhances boundary precision and clinically relevant segmentation metrics on public dermoscopic benchmarks.
  • Outperforms traditional encoder-decoder baselines in lesion segmentation accuracy.
  • Generates pixel-accurate masks without the need for extensive post-processing or complex pre-training.
  • Provides a valuable component for building automated melanoma assessment systems.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The method could be adapted for segmenting other types of skin lesions or medical images with similar artifact challenges.
  • Further validation on datasets representing more diverse skin tones and clinical settings would strengthen its applicability.
  • Combining this architecture with real-time inference optimizations might enable deployment in clinical decision support tools.
  • The contrastive regularizer may offer benefits in other segmentation tasks where feature stability is key on limited data.

Load-bearing premise

The public dermoscopic benchmarks used are representative of real-world clinical variability in artifacts, skin types, and lesion appearances.

What would settle it

The claims would be challenged if, on a new clinical dataset with greater variability in skin types, lighting, or artifact types, the method failed to improve boundary precision or segmentation metrics over the baselines.

Original abstract

Lesion segmentation, in contrast to natural scene segmentation, requires handling subtle variations in texture and color, frequent imaging artifacts (such as hairs, rulers, and bubbles), and a critical need for precise boundary localization to aid in accurate diagnosis. The accurate delineation of melanocytic tumors in dermoscopic images is a crucial component of automated skin cancer screening systems and clinical decision support. In this paper, we present a novel dual-resolution architecture inspired by ResNet, specifically tailored for the segmentation of melanocytic tumors. Our approach incorporates a high-resolution stream that preserves fine boundary details, alongside a complementary pooled stream that captures multi-scale contextual information for robust lesion recognition. These two streams are closely integrated through boundary-aware residual connections, which inject edge information into deep feature maps, and a channel attention mechanism that adapts the model's sensitivity to color and texture variations in dermoscopic images. To tackle common imaging artifacts and the challenges posed by small clinical datasets, we introduce a lightweight artifact suppression block and a multi-task training strategy. This strategy combines the Dice-Tversky loss with an explicit boundary loss and a contrastive regularizer to enhance feature stability. This unified design enables the model to generate pixel-accurate segmentation masks without the need for extensive post-processing or complex pre-training. Extensive evaluation on public dermoscopic benchmarks reveals that our method significantly enhances boundary precision and clinically relevant segmentation metrics, outperforming traditional encoder-decoder baselines. This makes our approach a valuable component for building automated melanoma assessment systems.
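The abstract does not say which contrastive regularizer is used. As a stand-in, the sketch below uses an InfoNCE-style term, a common choice: each anchor embedding should be more similar to its own positive (for example, features of an augmented view of the same image) than to the positives of other samples. The function names and the `temperature` default are assumptions for illustration only.

```python
import math
from typing import Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity with a small constant to avoid division by zero."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv + 1e-12)


def info_nce(anchors: Sequence[Sequence[float]],
             positives: Sequence[Sequence[float]],
             temperature: float = 0.1) -> float:
    """Mean InfoNCE loss; positives[i] is the matching view for anchors[i]."""
    n = len(anchors)
    total = 0.0
    for i in range(n):
        logits = [cosine(anchors[i], positives[j]) / temperature for j in range(n)]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - logits[i]
    return total / n


if __name__ == "__main__":
    a = [[1.0, 0.0], [0.0, 1.0]]
    print(info_nce(a, [[1.0, 0.0], [0.0, 1.0]]))  # matched views: near zero
    print(info_nce(a, [[0.0, 1.0], [1.0, 0.0]]))  # mismatched views: large
```

A term like this is typically added to the segmentation loss with a small weight, so that it stabilizes features without dominating the mask objective.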

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The paper proposes a DualResolution Residual Architecture for melanocytic lesion segmentation in dermoscopic images. It uses a high-resolution stream to preserve boundary details and a pooled stream for multi-scale context, integrated via boundary-aware residual connections and channel attention. A lightweight artifact suppression block addresses imaging artifacts, while multi-task training combines Dice-Tversky loss, an explicit boundary loss, and a contrastive regularizer. The central claim is that this design yields superior boundary precision and clinically relevant segmentation metrics on public dermoscopic benchmarks compared to traditional encoder-decoder baselines, without requiring extensive post-processing.

Significance. If the empirical results hold with proper validation, the work could offer a practical, lightweight contribution to automated skin cancer screening by improving handling of artifacts and subtle boundaries in dermoscopy. The multi-task strategy and avoidance of complex pre-training are pragmatic strengths for deployment on limited clinical data.

major comments (1)
  1. [Experimental Evaluation] Experimental Evaluation section: the headline claim that the method 'significantly enhances boundary precision and clinically relevant segmentation metrics' while outperforming baselines rests on public dermoscopic benchmarks; however, these benchmarks are not shown to capture sufficient variability in Fitzpatrick skin types, artifact distributions (hairs, rulers, bubbles), or subtle lesion boundaries typical of real clinical settings. Without cross-dataset generalization tests or ablations isolating the artifact suppression block under distribution shift, the robustness and clinical relevance of the gains are not fully supported.
minor comments (1)
  1. [Abstract] The abstract would be strengthened by including at least one or two key quantitative results (e.g., Dice or boundary F1 scores with baselines) rather than purely qualitative assertions of superiority.
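For readers unfamiliar with the two metrics the referee asks for, here is a minimal sketch of Dice and a boundary F1 score on binary masks. The paper's own evaluation protocol is not given in this summary, so the boundary definition (foreground pixels with a 4-connected background neighbour or on the image border) and the Chebyshev-distance tolerance `tol` are assumptions.

```python
from typing import List, Set, Tuple

Mask = List[List[int]]


def dice_score(pred: Mask, gt: Mask) -> float:
    """Dice = 2|P ∩ G| / (|P| + |G|) for binary masks of equal shape."""
    inter = sum(p & g for pr, gr in zip(pred, gt) for p, g in zip(pr, gr))
    total = sum(map(sum, pred)) + sum(map(sum, gt))
    return 1.0 if total == 0 else 2.0 * inter / total


def boundary_pixels(mask: Mask) -> Set[Tuple[int, int]]:
    """Foreground pixels touching background (4-connectivity) or the image border."""
    h, w = len(mask), len(mask[0])
    out = set()
    for i in range(h):
        for j in range(w):
            if not mask[i][j]:
                continue
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                if not (0 <= ni < h and 0 <= nj < w) or not mask[ni][nj]:
                    out.add((i, j))
                    break
    return out


def boundary_f1(pred: Mask, gt: Mask, tol: int = 1) -> float:
    """F1 over boundary pixels matched within `tol` pixels (Chebyshev distance)."""
    bp, bg = boundary_pixels(pred), boundary_pixels(gt)
    if not bp and not bg:
        return 1.0
    if not bp or not bg:
        return 0.0

    def near(p, others):
        return any(max(abs(p[0] - q[0]), abs(p[1] - q[1])) <= tol for q in others)

    precision = sum(near(p, bg) for p in bp) / len(bp)
    recall = sum(near(g, bp) for g in bg) / len(bg)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Reporting both numbers matters because a mask can score well on Dice while its contour is displaced, which is the failure mode a boundary metric is meant to expose.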

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address the major comment below and outline revisions that will be incorporated to strengthen the presentation of robustness and clinical relevance.

Point-by-point responses
  1. Referee: [Experimental Evaluation] Experimental Evaluation section: the headline claim that the method 'significantly enhances boundary precision and clinically relevant segmentation metrics' while outperforming baselines rests on public dermoscopic benchmarks; however, these benchmarks are not shown to capture sufficient variability in Fitzpatrick skin types, artifact distributions (hairs, rulers, bubbles), or subtle lesion boundaries typical of real clinical settings. Without cross-dataset generalization tests or ablations isolating the artifact suppression block under distribution shift, the robustness and clinical relevance of the gains are not fully supported.

    Authors: We agree that explicit demonstration of generalization across greater clinical variability would strengthen the claims. While the public benchmarks (ISIC 2017/2018 and PH2) already contain images spanning multiple skin tones, common artifacts (hairs, rulers, bubbles), and lesions with ambiguous boundaries, we acknowledge that dedicated cross-dataset tests and isolated ablations of the artifact suppression block under distribution shift are not currently reported. In the revised manuscript we will add these experiments, including evaluation on an external dataset and controlled ablations that isolate the artifact block under simulated shifts in artifact prevalence and skin-tone distribution. These additions will be placed in the Experimental Evaluation section and will directly support the robustness assertions.

    Revision: yes

Circularity Check

0 steps flagged

Empirical architecture proposal with no derivation chain or self-referential reductions

Full rationale

The paper introduces a dual-resolution residual network with boundary-aware connections, channel attention, a lightweight artifact suppression block, and multi-task losses (Dice-Tversky + boundary + contrastive) for dermoscopic lesion segmentation. All claims rest on experimental comparisons against encoder-decoder baselines on public benchmarks, with no mathematical derivations, predictions, or uniqueness theorems that reduce by construction to fitted inputs, self-citations, or ansatzes. Design elements are presented as engineering choices evaluated empirically rather than derived from prior self-referential results. The work is self-contained against external benchmarks, producing a normal non-finding for circularity.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim depends on empirical validation of the proposed components rather than first-principles derivation; standard neural network assumptions about data distribution and optimization are invoked implicitly.

free parameters (1)
  • multi-task loss weights
    The combination of Dice-Tversky, boundary, and contrastive terms implies tunable coefficients whose specific values are not stated and must be chosen to achieve the reported performance.
axioms (1)
  • domain assumption: Dermoscopic images contain a limited set of common artifacts (hairs, rulers, bubbles) that can be effectively suppressed by a lightweight dedicated block without harming lesion features.
    Invoked in the description of the artifact suppression block and its role in handling small clinical datasets.
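
To make the free parameter in this ledger concrete, one plausible form of the multi-task objective is a weighted sum of the three terms. This is an assumption: the summary states only that the terms are combined, and the symbols λ_b and λ_c are introduced here for illustration.

```latex
\mathcal{L}_{\text{total}}
  = \mathcal{L}_{\text{DT}}
  + \lambda_{b}\,\mathcal{L}_{\text{bnd}}
  + \lambda_{c}\,\mathcal{L}_{\text{con}},
\qquad \lambda_{b},\ \lambda_{c} \ge 0
```

Here L_DT is the Dice-Tversky term, L_bnd the explicit boundary loss, and L_con the contrastive regularizer. Under this form, the unreported coefficients λ_b and λ_c are the tunable quantities the ledger counts.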

pith-pipeline@v0.9.0 · 5817 in / 1401 out tokens · 39052 ms · 2026-05-19T00:37:59.243270+00:00 · methodology

