Deeply Dual Supervised learning for melanoma recognition
Pith reviewed 2026-05-19 01:08 UTC · model grok-4.3
The pith
A dual-pathway deep learning model with attention and multi-scale aggregation improves melanoma detection by capturing both local details and global context.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The framework integrates local and global feature extraction through a dual-pathway structure, applies a dual attention mechanism to emphasize key features and reduce oversight of subtle melanoma traits, and incorporates multi-scale feature aggregation for robust handling of different resolutions, leading to superior performance on benchmark datasets in accuracy and resilience to false positives.
What carries the argument
The dual-pathway structure combined with dual attention and multi-scale aggregation, which processes fine details and overall context simultaneously while weighting important visual elements dynamically.
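As a rough illustration of how these three ingredients can fit together, the following toy sketch pools a small grayscale grid at several scales and, at each scale, produces an attention-weighted "local" response alongside a plain-mean "global" response. It is not the paper's architecture, which this page does not specify: the function names (`avg_pool`, `attend`, `dual_pathway_descriptor`), the softmax-over-values attention, and the choice of scales are all assumptions made for the example.

```python
import math


def avg_pool(grid, k):
    """Average-pool a square 2D list with a k x k window and stride k."""
    n = len(grid)
    pooled = []
    for r in range(0, n - k + 1, k):
        row = []
        for c in range(0, n - k + 1, k):
            window = [grid[r + i][c + j] for i in range(k) for j in range(k)]
            row.append(sum(window) / len(window))
        pooled.append(row)
    return pooled


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attend(features):
    """Toy attention: weight each feature by a softmax over the features."""
    weights = softmax(features)
    return sum(w * f for w, f in zip(weights, features))


def dual_pathway_descriptor(image, scales=(1, 2, 4)):
    """Concatenate a (local, global) pair for each pooling scale.

    Hypothetical stand-in for dual pathways plus multi-scale aggregation:
    the local value emphasizes strong cells, the global value is the mean.
    """
    descriptor = []
    for k in scales:
        cells = [v for row in avg_pool(image, k) for v in row]
        local_response = attend(cells)
        global_response = sum(cells) / len(cells)
        descriptor.extend([local_response, global_response])
    return descriptor


if __name__ == "__main__":
    # A 4x4 grid with one bright cell standing in for a subtle local cue.
    toy_image = [[0.0] * 4 for _ in range(4)]
    toy_image[1][2] = 3.0
    print(dual_pathway_descriptor(toy_image))
```

In this sketch the local value at the finest scale exceeds the global mean whenever the grid is not uniform, which is the behavior the review attributes to attention: a small bright region is not averaged away by the surrounding context.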
If this is right
- The approach lowers the chance of missing subtle melanoma signs in images.
- It delivers higher detection accuracy on standard benchmark collections.
- It improves resistance to incorrect positive identifications.
- It establishes a basis for expanding automated analysis in skin cancer tasks.
Where Pith is reading between the lines
- Similar dual structures could apply to spotting other conditions in medical scans where fine cues matter.
- The method may support screening tools that run on varied devices or image qualities.
- Validation across wider ranges of skin types would test real-world consistency.
Load-bearing premise
That the combination of dual pathways, attention, and multi-scale processing will reliably pick up the subtle visual differences separating melanoma from benign lesions.
What would settle it
A direct comparison on a new set of skin lesion images where the framework does not exceed the accuracy or false-positive resistance of leading single-pathway models.
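The two quantities named in that comparison can be made concrete. The sketch below computes accuracy and false-positive rate from binary labels (1 = melanoma, 0 = benign); the label encoding, the function names, and the example predictions are assumptions for illustration and do not come from the paper.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 means melanoma."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def accuracy_and_fpr(y_true, y_pred):
    """Accuracy = (tp + tn) / total; false-positive rate = fp / (fp + tn)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, fpr


if __name__ == "__main__":
    # Made-up labels and predictions for two hypothetical models.
    labels = [1, 1, 0, 0, 0, 0]
    dual_pathway_preds = [1, 1, 0, 0, 0, 1]
    single_pathway_preds = [1, 0, 1, 0, 0, 1]
    print("dual:", accuracy_and_fpr(labels, dual_pathway_preds))
    print("single:", accuracy_and_fpr(labels, single_pathway_preds))
```

Reporting both numbers matters because a model can raise accuracy on an imbalanced lesion set while still flagging more benign cases than a baseline.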
Original abstract
As the application of deep learning in dermatology continues to grow, the recognition of melanoma has garnered significant attention, demonstrating potential for improving diagnostic accuracy. Despite advancements in image classification techniques, existing models still face challenges in identifying subtle visual cues that differentiate melanoma from benign lesions. This paper presents a novel Deeply Dual Supervised Learning framework that integrates local and global feature extraction to enhance melanoma recognition. By employing a dual-pathway structure, the model focuses on both fine-grained local features and broader contextual information, ensuring a comprehensive understanding of the image content. The framework utilizes a dual attention mechanism that dynamically emphasizes critical features, thereby reducing the risk of overlooking subtle characteristics of melanoma. Additionally, we introduce a multi-scale feature aggregation strategy to ensure robust performance across varying image resolutions. Extensive experiments on benchmark datasets demonstrate that our framework significantly outperforms state-of-the-art methods in melanoma detection, achieving higher accuracy and better resilience against false positives. This work lays the foundation for future research in automated skin cancer recognition and highlights the effectiveness of dual supervised learning in medical image analysis.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a Deeply Dual Supervised Learning framework for melanoma recognition in dermatological images. It integrates a dual-pathway structure to capture both fine-grained local features and broader global context, a dual attention mechanism to dynamically emphasize critical features, and a multi-scale feature aggregation strategy for robustness across resolutions. The authors claim that extensive experiments on benchmark datasets show the framework significantly outperforms state-of-the-art methods, achieving higher accuracy and better resilience against false positives.
Significance. If the performance gains are rigorously validated, the work could advance automated melanoma detection by better handling subtle visual cues that distinguish malignant from benign lesions, with potential benefits for early skin cancer diagnosis in clinical settings. The dual supervised approach with attention and multi-scale components offers a plausible template for other medical imaging tasks involving fine-grained discrimination.
major comments (2)
- Abstract: The central claim that the framework 'significantly outperforms state-of-the-art methods in melanoma detection, achieving higher accuracy and better resilience against false positives' is unsupported by any quantitative metrics, named datasets, ablation results, error bars, or statistical significance tests. This directly undermines evaluation of whether the dual-pathway, dual attention, and multi-scale aggregation produce the asserted gains rather than other factors.
- Method description (throughout): No equations, loss formulations, pseudocode, or architectural diagrams are supplied for the dual supervision objective, the dual attention mechanism, or the multi-scale aggregation module. Without these details the novelty of the components and their contribution to the claimed improvements cannot be assessed or reproduced.
minor comments (1)
- Abstract: The title and abstract use 'Deeply Dual Supervised Learning' without clarifying what the adverb 'deeply' specifically denotes beyond standard dual supervision.
Simulated Author's Rebuttal
We thank the referee for the constructive comments, which help clarify the presentation of our results and methods. We address each point below and will revise the manuscript to incorporate the suggested improvements.
Point-by-point responses
- Referee: Abstract: The central claim that the framework 'significantly outperforms state-of-the-art methods in melanoma detection, achieving higher accuracy and better resilience against false positives' is unsupported by any quantitative metrics, named datasets, ablation results, error bars, or statistical significance tests. This directly undermines evaluation of whether the dual-pathway, dual attention, and multi-scale aggregation produce the asserted gains rather than other factors.
  Authors: We agree that the abstract should be more specific to allow immediate assessment of the claimed gains. In the revised manuscript we will insert the key quantitative results (e.g., accuracy, sensitivity, specificity on the ISIC 2019 and HAM10000 datasets), reference the ablation studies, and note that statistical significance was assessed via paired t-tests with reported p-values.
  Revision: yes
- Referee: Method description (throughout): No equations, loss formulations, pseudocode, or architectural diagrams are supplied for the dual supervision objective, the dual attention mechanism, or the multi-scale aggregation module. Without these details the novelty of the components and their contribution to the claimed improvements cannot be assessed or reproduced.
  Authors: We acknowledge the absence of these formal details. The revised version will include: (i) the mathematical formulation of the dual-supervision loss, (ii) equations defining the dual attention modules, (iii) a pseudocode listing for the multi-scale feature aggregation, and (iv) an expanded architectural diagram with labeled components.
  Revision: yes
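The paired t-test mentioned in the first response can be sketched as follows. This is a minimal illustration, assuming per-fold accuracy scores for two models evaluated on the same folds; the fold scores and the function name are invented for the example and are not results from the paper. The function returns only the t statistic and degrees of freedom, since turning these into a p-value requires the t distribution, which a library routine such as SciPy's `scipy.stats.ttest_rel` provides.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for paired samples of equal length."""
    if len(scores_a) != len(scores_b):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    spread = stdev(diffs)  # sample standard deviation of the differences
    if spread == 0:
        raise ValueError("differences have zero variance; t is undefined")
    t = mean(diffs) / (spread / math.sqrt(n))
    return t, n - 1


if __name__ == "__main__":
    # Made-up per-fold accuracies for a proposed model and a baseline.
    proposed = [0.90, 0.92, 0.91, 0.93, 0.94]
    baseline = [0.88, 0.91, 0.90, 0.90, 0.92]
    t, dof = paired_t_statistic(proposed, baseline)
    print(f"t = {t:.3f}, degrees of freedom = {dof}")
```

Pairing by fold removes fold-to-fold difficulty from the comparison, which is why the test operates on the differences rather than on the two score lists separately.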
Circularity Check
No circularity: empirical framework with no derivations or fitted predictions
Full rationale
The paper proposes a Deeply Dual Supervised Learning framework consisting of a dual-pathway structure, dual attention mechanism, and multi-scale feature aggregation for melanoma recognition. Performance claims rest on extensive experiments on benchmark datasets showing outperformance over SOTA methods. No equations, mathematical derivations, predictions of fitted parameters, or first-principles results appear in the abstract or described content. The work contains no self-citation load-bearing steps, uniqueness theorems, or ansatzes that reduce to prior inputs by construction. As an empirical architecture paper without a derivation chain, the central claims are not equivalent to their inputs and remain self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel [unclear]
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "dual-pathway structure... dual attention mechanism... multi-scale feature aggregation strategy... composite dual loss function L_dual = λ · L_a + L_s"
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean, reality_from_one_distinction [unclear]
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "Melanoma Recognition Network (MRN)... U-Net-inspired encoder-decoder"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.