Edge Detection for Organ Boundaries via Top Down Refinement and SubPixel Upsampling
Pith reviewed 2026-05-19 00:42 UTC · model grok-4.3
The pith
A top-down backward refinement architecture with subpixel upsampling produces millimeter-accurate organ boundaries in CT and MRI scans.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that a top-down backward refinement architecture, adapted to medical images, produces crisp, high-resolution organ boundaries in 2D slices and anisotropic volumes. It does so by progressively upsampling high-level semantic features and fusing them with fine-grained low-level cues through a dedicated pathway. The paper reports that the method outperforms baseline ConvNet detectors and other medical edge methods on strict boundary F-measure and Hausdorff distance, and that it also improves downstream segmentation, registration, and lesion delineation.
What carries the argument
The top-down backward refinement pathway that progressively upsamples and fuses high-level semantic features with low-level cues, extended by light 3D context aggregation for volumes.
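To make the mechanism concrete, here is a minimal NumPy sketch of one refinement step. It assumes that "subpixel upsampling" means the usual periodic channel-to-space shuffle and that fusion is a plain element-wise sum. The paper's actual layers, channel counts, and fusion operator are not given in this summary, so `pixel_shuffle` and `refine_step` are hypothetical names for illustration only.

```python
import numpy as np

def pixel_shuffle(x: np.ndarray, r: int) -> np.ndarray:
    """Rearrange (C*r*r, H, W) features into (C, H*r, W*r)."""
    c_in, h, w = x.shape
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c = c_in // (r * r)
    x = x.reshape(c, r, r, h, w)       # split channels into (c, i, j)
    x = x.transpose(0, 3, 1, 4, 2)     # reorder to (c, h, i, w, j)
    return x.reshape(c, h * r, w * r)  # interleave the sub-pixel offsets

def refine_step(high: np.ndarray, low: np.ndarray, r: int = 2) -> np.ndarray:
    """One top-down step: upsample coarse features, then fuse with fine ones.

    Additive fusion is an assumption made for this sketch.
    """
    up = pixel_shuffle(high, r)
    if up.shape != low.shape:
        raise ValueError(f"shape mismatch: {up.shape} vs {low.shape}")
    return up + low
```

In a learned network the coarse features would first pass through a convolution that produces the `C*r*r` channels. The shuffle itself has no parameters, which is one reason it is often preferred over transposed convolution when crisp localization matters.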
If this is right
- Substantially higher boundary F-measure and lower Hausdorff distance on several CT and MRI organ datasets.
- Consistent gains in organ segmentation, shown by higher Dice scores and reduced boundary errors.
- More accurate image registration when crisp edges are supplied.
- Better delineation of lesions located near organ interfaces.
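The segmentation gains above are stated in terms of the Dice score. For reference, a minimal sketch of that metric on binary masks follows. The function name and the convention of returning 1.0 when both masks are empty are assumptions of this sketch, not details taken from the paper.

```python
import numpy as np

def dice_score(pred, target) -> float:
    """Dice overlap 2|A∩B| / (|A| + |B|) between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0  # both masks empty: treated as perfect agreement (assumed convention)
    return float(2.0 * np.logical_and(pred, target).sum() / denom)
```

Dice is dominated by an organ's interior, so it can stay high while the boundary drifts. This is why the summary pairs it with boundary-specific errors.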
Where Pith is reading between the lines
- The same refinement idea could be tested on other boundary-critical medical tasks such as vessel or tumor margin detection without changing the core fusion logic.
- Because the method already mixes 2D slice processing with minimal 3D context, it may scale to full 3D networks if memory allows while preserving the reported efficiency.
- Feeding these edges into interactive annotation tools might reduce the number of manual corrections needed at organ borders.
Load-bearing premise
That fusing high-level semantic features with low-level cues through backward refinement will reliably deliver millimeter-level boundary accuracy on medical images without introducing artifacts or needing extensive per-dataset tuning.
What would settle it
Apply the method to a new multi-center CT or MRI dataset with unseen scanner protocols and noise levels; if boundary F-measure and Hausdorff distance do not improve over the same baselines, the central claim does not hold.
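A test of this kind needs the Hausdorff distance in millimeters rather than in voxels, because multi-center data rarely share a voxel size. The sketch below is one way to compute the symmetric Hausdorff distance between two boundary point sets. The function name and the brute-force pairwise approach are illustrative choices, not the paper's evaluation code.

```python
import numpy as np

def hausdorff_mm(a_idx, b_idx, spacing) -> float:
    """Symmetric Hausdorff distance between two boundary point sets, in mm.

    a_idx, b_idx: (N, D) arrays of voxel indices on each boundary.
    spacing: voxel size in mm along each of the D axes.
    """
    s = np.asarray(spacing, dtype=float)
    a = np.asarray(a_idx, dtype=float) * s
    b = np.asarray(b_idx, dtype=float) * s
    if a.size == 0 or b.size == 0:
        raise ValueError("both boundaries must contain at least one point")
    # Pairwise distances, shape (Na, Nb); memory grows as Na * Nb.
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return float(max(d.min(axis=1).max(), d.min(axis=0).max()))
```

Because the result is a maximum, a single stray edge pixel can dominate it. Evaluations often report a percentile variant alongside it for that reason.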
Original abstract
Accurate localization of organ boundaries is critical in medical imaging for segmentation, registration, surgical planning, and radiotherapy. While deep convolutional networks (ConvNets) have advanced general-purpose edge detection to near-human performance on natural images, their outputs often lack precise localization, a limitation that is particularly harmful in medical applications where millimeter-level accuracy is required. Building on a systematic analysis of ConvNet edge outputs, we propose a medically focused crisp edge detector that adapts a novel top-down backward refinement architecture to medical images (2D and volumetric). Our method progressively upsamples and fuses high-level semantic features with fine-grained low-level cues through a backward refinement pathway, producing high-resolution, well-localized organ boundaries. We further extend the design to handle anisotropic volumes by combining 2D slice-wise refinement with light 3D context aggregation to retain computational efficiency. Evaluations on several CT and MRI organ datasets demonstrate substantially improved boundary localization under strict criteria (boundary F-measure, Hausdorff distance) compared to baseline ConvNet detectors and contemporary medical edge/contour methods. Importantly, integrating our crisp edge maps into downstream pipelines yields consistent gains in organ segmentation (higher Dice scores, lower boundary errors), more accurate image registration, and improved delineation of lesions near organ interfaces. The proposed approach produces clinically valuable, crisp organ edges that materially enhance common medical-imaging tasks.
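The abstract does not say how the "light 3D context aggregation" is implemented. As one plausible reading, the sketch below runs a 2D detector slice by slice and then blends each edge map with its through-plane neighbors using a short 1D kernel. The names `slicewise`, `aggregate_through_plane`, and `detect_2d`, and the default weights, are all assumptions made for illustration.

```python
import numpy as np

def slicewise(volume, detect_2d):
    """Apply a 2D edge detector to every slice of a (Z, H, W) volume."""
    return np.stack([detect_2d(s) for s in np.asarray(volume, dtype=float)])

def aggregate_through_plane(edge_slices, weights=(0.25, 0.5, 0.25)):
    """Blend each slice's edge map with its neighbors along the z axis."""
    vol = np.asarray(edge_slices, dtype=float)  # (Z, H, W)
    w = np.asarray(weights, dtype=float)
    if w.size % 2 != 1:
        raise ValueError("weights must have odd length")
    w = w / w.sum()                             # keep output in the input range
    half = w.size // 2
    # Replicate the first and last slices so border slices keep full weight.
    padded = np.pad(vol, ((half, half), (0, 0), (0, 0)), mode="edge")
    out = np.zeros_like(vol)
    for k, wk in enumerate(w):
        out += wk * padded[k:k + vol.shape[0]]
    return out
```

Keeping the in-plane work 2D and limiting the 3D part to a few neighboring slices is cheap, and it avoids treating a thick slice gap as if it were an in-plane pixel step.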
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a top-down backward refinement architecture with subpixel upsampling for crisp organ boundary detection in 2D and volumetric medical CT/MRI images. It progressively fuses high-level semantic features with low-level cues via a backward pathway, extends the design to anisotropic volumes using slice-wise 2D refinement plus light 3D aggregation, and claims superior boundary localization (F-measure, Hausdorff distance) over ConvNet baselines and medical edge methods, plus gains when the edges are fed into downstream segmentation, registration, and lesion delineation pipelines.
Significance. If the empirical improvements hold under rigorous evaluation, the method could offer a practical advance for millimeter-level boundary accuracy in clinical workflows where precise organ interfaces matter for segmentation, registration, and radiotherapy. The efficiency-focused 3D extension and emphasis on medical-specific challenges (anisotropy, low contrast) are positive aspects.
Major comments (2)
- [§4] §4 (Experiments) and associated tables: the abstract and §1 assert substantially improved boundary F-measure and Hausdorff distance plus downstream Dice gains, yet no numerical tables, dataset sizes, error bars, cross-validation details, or ablation results are provided. This directly undermines verification of the central empirical claim.
- [§3] §3 (Method, backward refinement pathway): the description of progressive upsampling and high-to-low feature fusion does not include analysis or controls for artifact introduction in low-contrast or partial-volume regions typical of CT/MRI, nor evidence that millimeter accuracy is achieved without per-dataset tuning. This is load-bearing for the generalization claim.
Minor comments (2)
- [Abstract] Abstract: specify the exact CT and MRI organ datasets used and their key characteristics (resolution, anisotropy, number of cases).
- [§4] Figure captions and §4: ensure all boundary metric plots include baseline comparisons with the same strict criteria (e.g., tolerance thresholds for F-measure).
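The tolerance threshold matters because the boundary F-measure changes sharply with it. The sketch below uses a simplified distance-based variant: a predicted point counts as correct if any ground-truth point lies within `tol_mm`, and vice versa for recall. Benchmark implementations often use one-to-one matching instead, so this is an approximation under stated assumptions, and the function name is hypothetical.

```python
import numpy as np

def boundary_f_measure(pred_pts, gt_pts, tol_mm, spacing=(1.0, 1.0)) -> float:
    """Boundary F-measure with a distance tolerance given in mm.

    pred_pts, gt_pts: (N, D) voxel indices of boundary points.
    Returns 0.0 if either set is empty (assumed convention).
    """
    s = np.asarray(spacing, dtype=float)
    p = np.asarray(pred_pts, dtype=float).reshape(-1, s.size) * s
    g = np.asarray(gt_pts, dtype=float).reshape(-1, s.size) * s
    if len(p) == 0 or len(g) == 0:
        return 0.0
    d = np.linalg.norm(p[:, None, :] - g[None, :, :], axis=-1)
    precision = float((d.min(axis=1) <= tol_mm).mean())  # predicted points near GT
    recall = float((d.min(axis=0) <= tol_mm).mean())     # GT points near a prediction
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Reporting the score at several tolerances, with the same values for every baseline, makes it clear whether a gain comes from sharper localization or only from a lenient threshold.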
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. The comments highlight important areas for improving clarity and rigor, particularly around experimental reporting and methodological robustness. We address each major comment below and will revise the manuscript accordingly to strengthen the presentation of our results and analysis.
Point-by-point responses
- Referee: [§4] §4 (Experiments) and associated tables: the abstract and §1 assert substantially improved boundary F-measure and Hausdorff distance plus downstream Dice gains, yet no numerical tables, dataset sizes, error bars, cross-validation details, or ablation results are provided. This directly undermines verification of the central empirical claim.
  Authors: We agree that the experimental details must be presented more explicitly to enable full verification of the claims. The complete manuscript includes results on multiple CT and MRI organ datasets with boundary F-measure, Hausdorff distance, and downstream segmentation/registration metrics, but we acknowledge these may not have been sufficiently highlighted or tabulated in the reviewed version. In the revision, we will expand §4 with comprehensive tables reporting all quantitative results, dataset sizes and compositions, standard deviations from cross-validation, and ablation studies on the top-down refinement and subpixel upsampling components. We will also add explicit cross-references from the abstract and §1 to these tables.
  Revision: yes
- Referee: [§3] §3 (Method, backward refinement pathway): the description of progressive upsampling and high-to-low feature fusion does not include analysis or controls for artifact introduction in low-contrast or partial-volume regions typical of CT/MRI, nor evidence that millimeter accuracy is achieved without per-dataset tuning. This is load-bearing for the generalization claim.
  Authors: We recognize the importance of addressing potential artifacts and generalization explicitly for medical images. While the method is designed to mitigate issues in low-contrast areas through progressive high-to-low fusion and subpixel upsampling, we will revise §3 to include a new analysis subsection. This will provide qualitative and quantitative controls (e.g., edge maps and error metrics in partial-volume regions), discuss design elements that reduce artifact risk without per-dataset hyperparameter tuning, and reference cross-dataset results demonstrating consistent millimeter-level boundary accuracy. These additions will better support the generalization claims.
  Revision: yes
Circularity Check
No circularity: empirical validation of refinement architecture
Full rationale
The paper proposes a top-down backward refinement pathway with progressive upsampling and feature fusion for organ boundary edge detection in CT/MRI, extended to anisotropic volumes. Its central claims rest on empirical evaluations using boundary F-measure, Hausdorff distance, and downstream gains in segmentation and registration on multiple datasets. No equations, fitted parameters renamed as predictions, or self-citation chains reduce any result to its inputs by construction. The method adapts existing ConvNet ideas with a new fusion pathway, and its claims are checked against external benchmarks through the reported metrics.