arxiv: 2508.06819 · v3 · submitted 2025-08-09 · 💻 cs.CV

VesselRW: Weakly Supervised Subcutaneous Vessel Segmentation via Learned Random Walk Propagation

Pith reviewed 2026-05-19 00:33 UTC · model grok-4.3

classification 💻 cs.CV
keywords vessel segmentation, weakly supervised, random walk, label propagation, subcutaneous imaging, uncertainty estimation, topology aware

The pith

Differentiable random walk propagation expands sparse vessel annotations into dense probabilistic supervision for subcutaneous segmentation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper develops a weakly supervised framework for segmenting subcutaneous vessels in clinical images using only inexpensive sparse annotations such as centerlines or scribbles. A random walk model turns these sparse marks into dense training signals by spreading labels according to vessel-like patterns in the image data. The propagation model is trained end-to-end with a convolutional network, so it learns to respect tubular continuity and to produce uncertainty maps. An added regularizer keeps the resulting vessel trees connected and free of stray branches. The paper reports segmentation performance that exceeds both naive sparse-label training and standard dense pseudo-labeling on clinical patient data.

Core claim

By jointly training a CNN segmentation network with a differentiable random walk label propagation model, sparse annotations can be expanded into accurate dense probabilistic maps that integrate vesselness cues and tubular continuity priors from the image data, yielding superior vascular segmentation and calibrated uncertainty estimates without requiring explicit edge labels or dense ground truth.

What carries the argument

Differentiable random walk label propagation model that computes per-pixel hitting probabilities from sparse seeds using image data to enforce vessel continuity and produce uncertainty estimates for the loss.
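To make the term "hitting probability" concrete, the sketch below runs a classical random walker on a 4-connected pixel grid: every unseeded pixel is repeatedly set to the affinity-weighted average of its neighbours until the values settle, with seeds held fixed at 1 (vessel) or 0 (background). This is only an illustration under stated assumptions. The Gaussian intensity affinity, the `beta` parameter, the function name, and the Gauss-Seidel solver are hypothetical choices, and unlike the paper's model this version is neither learned nor differentiable.

```python
import math

def random_walk_probabilities(image, fg_seeds, bg_seeds, beta=10.0, iters=500):
    """Probability that a walker starting at each pixel reaches a vessel seed
    before a background seed. Assumed affinity: exp(-beta * (I_a - I_b)^2)."""
    h, w = len(image), len(image[0])
    # Seeds act as fixed boundary conditions: 1 for vessel, 0 for background.
    fixed = {seed: 1.0 for seed in fg_seeds}
    fixed.update({seed: 0.0 for seed in bg_seeds})
    prob = [[fixed.get((r, c), 0.5) for c in range(w)] for r in range(h)]
    for _ in range(iters):  # Gauss-Seidel sweeps over the unseeded pixels
        for r in range(h):
            for c in range(w):
                if (r, c) in fixed:
                    continue
                num = den = 0.0
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w:
                        wgt = math.exp(-beta * (image[r][c] - image[rr][cc]) ** 2)
                        num += wgt * prob[rr][cc]
                        den += wgt
                if den > 0.0:
                    prob[r][c] = num / den
    return prob

# Toy 5x5 image: a bright vertical "vessel" in column 2 on a dark background.
image = [[1.0 if c == 2 else 0.0 for c in range(5)] for _ in range(5)]
# One vessel seed at the top of the line, one background seed on each side.
probs = random_walk_probabilities(image, fg_seeds=[(0, 2)], bg_seeds=[(2, 0), (2, 4)])
```

In this toy case the single vessel seed propagates down the whole bright column, while pixels on either side stay close to the background label. That is the behaviour the load-bearing premise relies on: image contrast alone must be enough to steer the walk.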

If this is right

  • Segmentation networks can be trained effectively with far fewer annotation resources than full dense labeling.
  • The produced uncertainty maps highlight regions where predictions are less reliable, aiding clinical review.
  • Topology regularization maintains vessel connectivity important for downstream vascular analysis.
  • Performance gains over naive sparse and pseudo-label methods validate the integrated propagation approach.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar propagation techniques could apply to segmenting other linear structures such as nerves or roads in aerial images.
  • End-to-end training might allow the system to adapt to new imaging modalities with minimal additional supervision.
  • Improved uncertainty calibration could support risk-aware decision systems in medical procedures.

Load-bearing premise

Image features alone supply enough information about vessel locations and connections for the random walk to generate reliable dense labels from sparse starting points.

What would settle it

A comparison of the method's predicted vessel maps to expert-drawn dense annotations on a new set of clinical images, measuring whether overlap and connectivity metrics exceed those of baseline methods.
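As a rough illustration of what such a comparison could measure, the sketch below computes Dice and IoU between two binary masks and counts 8-connected foreground components as a crude connectivity proxy. The function names and the choice of component count as the connectivity measure are assumptions made here for illustration; the source does not say which connectivity metric the paper uses.

```python
def dice_iou(pred, truth):
    """Dice and IoU between two binary masks given as nested lists of 0/1."""
    inter = sum(p * t for pr, tr in zip(pred, truth) for p, t in zip(pr, tr))
    p_sum = sum(sum(row) for row in pred)
    t_sum = sum(sum(row) for row in truth)
    union = p_sum + t_sum - inter
    dice = 2.0 * inter / (p_sum + t_sum) if (p_sum + t_sum) else 1.0
    iou = inter / union if union else 1.0
    return dice, iou

def count_components(mask):
    """Number of 8-connected foreground components, found by flood fill."""
    h, w = len(mask), len(mask[0])
    seen = set()
    count = 0
    for r in range(h):
        for c in range(w):
            if not mask[r][c] or (r, c) in seen:
                continue
            count += 1
            stack = [(r, c)]
            seen.add((r, c))
            while stack:
                y, x = stack.pop()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            stack.append((ny, nx))
    return count

# Expert mask: one unbroken vertical vessel.
truth = [[0, 1, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0]]
# Prediction: the same vessel with a one-pixel gap and one stray pixel.
pred = [[0, 1, 0, 1], [0, 1, 0, 0], [0, 0, 0, 0], [0, 1, 0, 0]]

dice, iou = dice_iou(pred, truth)
n_pred, n_truth = count_components(pred), count_components(truth)
```

The toy prediction still overlaps most of the vessel, yet it splits into three components where the expert mask has one. Overlap and connectivity can therefore disagree, which is why the test above asks for both.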

Original abstract

The task of parsing subcutaneous vessels in clinical images is often hindered by the high cost and limited availability of ground truth data, as well as the challenge of low contrast and noisy vessel appearances across different patients and imaging modalities. In this work, we propose a novel weakly supervised training framework specifically designed for subcutaneous vessel segmentation. This method utilizes low-cost, sparse annotations such as centerline traces, dot markers, or short scribbles to guide the learning process. These sparse annotations are expanded into dense probabilistic supervision through a differentiable random walk label propagation model, which integrates vesselness cues and tubular continuity priors driven by image data. The label propagation process results in per-pixel hitting probabilities and uncertainty estimates, which are incorporated into an uncertainty-weighted loss function to prevent overfitting in ambiguous areas. Notably, the label propagation model is trained jointly with a CNN-based segmentation network, allowing the system to learn vessel boundaries and continuity constraints without the need for explicit edge supervision. Additionally, we introduce a topology-aware regularizer that encourages centerline connectivity and penalizes irrelevant branches, further enhancing clinical applicability. Our experiments on clinical subcutaneous imaging datasets demonstrate that our approach consistently outperforms both naive sparse-label training and traditional dense pseudo-labeling methods, yielding more accurate vascular maps and better-calibrated uncertainty, which is crucial for clinical decision-making. This method significantly reduces the annotation workload while maintaining clinically relevant vessel topology.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces VesselRW, a weakly supervised framework for subcutaneous vessel segmentation in clinical images. Sparse annotations (centerlines, dots, or scribbles) are expanded into dense probabilistic supervision via a differentiable random walk propagation model that incorporates learned vesselness cues and tubular continuity priors from the image data. The propagation model is trained jointly with a CNN segmentation network; an uncertainty-weighted loss prevents overfitting in ambiguous regions, and a topology-aware regularizer encourages centerline connectivity. Experiments on clinical subcutaneous imaging datasets are reported to show consistent outperformance over naive sparse-label training and traditional dense pseudo-labeling, with improved vascular map accuracy and better-calibrated uncertainty.

Significance. If the central claims hold, the work could meaningfully reduce annotation burden for vessel segmentation in medical imaging while preserving clinically relevant topology. The joint optimization of a learned random-walk propagator with a CNN is a distinctive technical element that, if shown to produce reliable dense supervision rather than being compensated by the network, would strengthen the case for learned propagation in weak-supervision pipelines.

major comments (2)
  1. [Experiments] The central claim that the learned random walk supplies faithful dense supervision from sparse labels is load-bearing, yet the manuscript provides no diagnostic that isolates the propagation step (e.g., overlap of propagated maps with held-out dense annotations or direct comparison against a fixed-affinity random-walk baseline). Without such evidence it remains possible that the CNN simply learns to ignore or correct noisy pseudo-labels, undermining the attribution of gains to the propagation model.
  2. [Results] The abstract and results claim superior uncertainty calibration, but no quantitative calibration metrics (e.g., expected calibration error or reliability diagrams) or comparison against the baselines are referenced. This omission makes it impossible to verify whether the uncertainty-weighted loss and learned propagation actually improve calibration or merely correlate with the reported accuracy gains.
minor comments (2)
  1. [Method] The description of the random-walk affinities and the precise form of the uncertainty-weighted loss would benefit from an explicit equation or pseudocode block to clarify how vesselness and continuity priors are encoded.
  2. [Experiments] The clinical datasets are referred to only generically; adding a table or paragraph with imaging modality, resolution, number of patients, and annotation density would improve reproducibility and allow readers to assess generalizability.
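For orientation on minor comment 1, here is one plausible shape an uncertainty-weighted loss could take. It is explicitly not the paper's formulation, which this page does not reproduce. The sketch assumes that each pixel's uncertainty is the binary entropy of its propagated probability, and that the weight is one minus the normalised entropy, so pixels near 0.5 contribute almost nothing. All names are hypothetical.

```python
import math

def binary_entropy(p, eps=1e-7):
    """Entropy (in nats) of a Bernoulli variable with success probability p."""
    p = min(max(p, eps), 1.0 - eps)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))

def uncertainty_weighted_bce(pred, target, eps=1e-7):
    """Cross-entropy against soft targets, down-weighted where targets are ambiguous.

    pred:   network foreground probabilities (flat list)
    target: propagated hitting probabilities used as soft labels (flat list)
    Assumed weight: w = 1 - H(target) / log(2), so w is 0 at target = 0.5.
    """
    total = weight_sum = 0.0
    for q, p in zip(pred, target):
        q = min(max(q, eps), 1.0 - eps)
        w = max(0.0, 1.0 - binary_entropy(p) / math.log(2.0))
        bce = -(p * math.log(q) + (1.0 - p) * math.log(1.0 - q))
        total += w * bce
        weight_sum += w
    return total / weight_sum if weight_sum > 0.0 else 0.0

# Three pixels: confident vessel, fully ambiguous, confident background.
target = [1.0, 0.5, 0.0]
loss_a = uncertainty_weighted_bce([0.9, 0.1, 0.1], target)
loss_b = uncertainty_weighted_bce([0.9, 0.9, 0.1], target)  # differs only at the ambiguous pixel
```

Under this assumed weighting the two losses coincide, because the pixel with target 0.5 carries essentially zero weight. That is the sense in which such a loss would avoid fitting ambiguous regions.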

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment point by point below and outline targeted revisions to strengthen the manuscript.

Point-by-point responses
  1. Referee: [Experiments] The central claim that the learned random walk supplies faithful dense supervision from sparse labels is load-bearing, yet the manuscript provides no diagnostic that isolates the propagation step (e.g., overlap of propagated maps with held-out dense annotations or direct comparison against a fixed-affinity random-walk baseline). Without such evidence it remains possible that the CNN simply learns to ignore or correct noisy pseudo-labels, undermining the attribution of gains to the propagation model.

    Authors: We agree that isolating the contribution of the learned random walk is important for attributing performance gains. In the revised manuscript we will add a direct comparison against a fixed-affinity random-walk baseline that uses the same propagation architecture but with non-learned affinities. On the subset of our clinical datasets that include held-out dense annotations, we will also report overlap metrics (Dice and IoU) between the propagated probabilistic maps and these dense labels to demonstrate the fidelity of the supervision signal. revision: yes

  2. Referee: [Results] The abstract and results claim superior uncertainty calibration, but no quantitative calibration metrics (e.g., expected calibration error or reliability diagrams) or comparison against the baselines are referenced. This omission makes it impossible to verify whether the uncertainty-weighted loss and learned propagation actually improve calibration or merely correlate with the reported accuracy gains.

    Authors: We acknowledge that quantitative calibration evidence is needed to support the claim. In the revision we will add expected calibration error (ECE) values and reliability diagrams for VesselRW and all baselines. These metrics will be computed on the test sets and will allow direct verification of whether the uncertainty-weighted loss and joint training improve calibration beyond the observed accuracy improvements. revision: yes
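For readers unfamiliar with the calibration metric named in the second response, the sketch below computes one common binary variant of expected calibration error: pixels are binned by predicted foreground probability, and each bin's mean prediction is compared with its observed foreground frequency. The function name, the ten-bin default, and the choice of this variant are assumptions; the authors' planned evaluation protocol is not specified here.

```python
def expected_calibration_error(probs, labels, n_bins=10):
    """Binary ECE: weighted mean gap between predicted probability and observed frequency.

    probs:  predicted foreground probabilities in [0, 1] (flat list)
    labels: ground-truth labels, 0 or 1 (flat list)
    """
    bins = [[0, 0.0, 0] for _ in range(n_bins)]  # [count, sum of probs, positives]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx][0] += 1
        bins[idx][1] += p
        bins[idx][2] += y
    n = len(probs)
    ece = 0.0
    for count, prob_sum, positives in bins:
        if count:
            ece += (count / n) * abs(prob_sum / count - positives / count)
    return ece

# Ten pixels all predicted at 0.9.
well_calibrated = expected_calibration_error([0.9] * 10, [1] * 9 + [0])
overconfident = expected_calibration_error([0.9] * 10, [1] * 5 + [0] * 5)
```

When nine of the ten pixels really are vessel, the gap is essentially zero; when only five are, the gap is about 0.4. A reliability diagram plots the same per-bin quantities rather than summing them into one number.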

Circularity Check

0 steps flagged

No significant circularity; propagation model provides independent image-driven signal

Full rationale

The framework expands sparse annotations via a differentiable random walk that incorporates vesselness cues and tubular continuity priors directly from the image data. This propagation produces per-pixel hitting probabilities used as supervision for the CNN, with joint training allowing the affinities to adapt to the data rather than being fixed or self-referential. Experiments explicitly compare against naive sparse-label training and traditional dense pseudo-labeling baselines on clinical datasets, with reported gains in accuracy and uncertainty calibration. No equations or steps in the abstract reduce the output supervision to a fit of the target labels by construction, and no self-citation chain is invoked to justify uniqueness or force the result. The derivation remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the premise that image-derived vesselness and continuity cues are sufficient to turn sparse marks into reliable dense supervision when the propagation and segmentation networks are trained together; no explicit free parameters or invented entities are named in the abstract.

axioms (2)
  • domain assumption: Sparse annotations such as centerlines or scribbles can be expanded into dense per-pixel hitting probabilities by a random walk that follows vesselness cues and tubular continuity priors present in the image data.
    This is the core mechanism stated in the abstract for generating the probabilistic supervision used in the loss.
  • domain assumption: Joint training of the propagation model with the CNN allows learning of vessel boundaries and continuity without explicit edge supervision.
    The abstract claims this joint optimization removes the need for additional edge labels.



