arxiv: 2508.06819 · v3 · submitted 2025-08-09 · 💻 cs.CV

VesselRW: Weakly Supervised Subcutaneous Vessel Segmentation via Learned Random Walk Propagation

Ayaan Nooruddin Siddiqui, Mahnoor Zaidi, Ayesha Nazneen Shahbaz, Priyadarshini Chatterjee, Krishnan Menon Iyer

Pith reviewed 2026-05-19 00:33 UTC · model grok-4.3

keywords vessel segmentation · weakly supervised · random walk · label propagation · subcutaneous imaging · uncertainty estimation · topology aware

The pith

Differentiable random walk propagation expands sparse vessel annotations into dense probabilistic supervision for subcutaneous segmentation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper develops a weakly supervised framework for segmenting subcutaneous vessels from clinical images that uses only inexpensive sparse annotations such as centerlines or scribbles. These sparse marks are turned into dense training signals by a random walk model that spreads labels according to vessel-like patterns in the image data. The model is trained end-to-end with a convolutional network so that the propagation learns to respect tubular continuity and produce uncertainty maps. An added regularizer keeps the resulting vessel trees connected and free of stray branches. The result is segmentation performance that beats simple sparse training and standard pseudo-label approaches on real patient data.

Core claim

By jointly training a CNN segmentation network with a differentiable random walk label propagation model, sparse annotations can be expanded into accurate dense probabilistic maps that integrate vesselness cues and tubular continuity priors from the image data, yielding superior vascular segmentation and calibrated uncertainty estimates without requiring explicit edge labels or dense ground truth.

What carries the argument

Differentiable random walk label propagation model that computes per-pixel hitting probabilities from sparse seeds using image data to enforce vessel continuity and produce uncertainty estimates for the loss.
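To make the hitting-probability idea concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: it assumes a 4-connected pixel grid, a fixed per-pixel affinity array (`weights`, a hypothetical stand-in for the learned, image-driven affinities), and explicit foreground and background seed lists. The function name and the Gauss-Seidel relaxation used to solve for the probabilities are illustrative choices.

```python
def hitting_probabilities(weights, fg_seeds, bg_seeds, n_iters=500):
    """Probability that a random walk started at each pixel reaches a
    foreground seed before a background seed (4-connected grid).

    weights[r][c] is a positive affinity (assumed, not from the paper);
    the walk steps to a neighbor with probability proportional to that
    neighbor's weight.
    """
    rows, cols = len(weights), len(weights[0])
    prob = [[0.5] * cols for _ in range(rows)]
    for r, c in fg_seeds:
        prob[r][c] = 1.0  # walks starting on a vessel seed have already hit it
    for r, c in bg_seeds:
        prob[r][c] = 0.0  # walks starting on a background seed never count
    fixed = set(fg_seeds) | set(bg_seeds)

    # Gauss-Seidel sweeps: each free pixel becomes the transition-weighted
    # average of its neighbors, which is the defining equation of a
    # hitting probability.
    for _ in range(n_iters):
        for r in range(rows):
            for c in range(cols):
                if (r, c) in fixed:
                    continue
                num = den = 0.0
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols:
                        num += weights[nr][nc] * prob[nr][nc]
                        den += weights[nr][nc]
                prob[r][c] = num / den
    return prob


# Usage: a 1x5 strip with a vessel seed on the left, background on the right.
strip = hitting_probabilities([[1.0] * 5], [(0, 0)], [(0, 4)])
```

With uniform weights the probabilities fall off linearly between the two seeds. Raising the weights along a tubular path would pull probability mass along that path, which is the behavior the learned affinities are meant to produce.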

If this is right

Segmentation networks can be trained effectively with far fewer annotation resources than full dense labeling.
The produced uncertainty maps highlight regions where predictions are less reliable, aiding clinical review.
Topology regularization maintains vessel connectivity important for downstream vascular analysis.
Performance gains over naive sparse and pseudo-label methods validate the integrated propagation approach.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar propagation techniques could apply to segmenting other linear structures such as nerves or roads in aerial images.
End-to-end training might allow the system to adapt to new imaging modalities with minimal additional supervision.
Improved uncertainty calibration could support risk-aware decision systems in medical procedures.

Load-bearing premise

Image features alone supply enough information about vessel locations and connections for the random walk to generate reliable dense labels from sparse starting points.

What would settle it

A comparison of the method's predicted vessel maps to expert-drawn dense annotations on a new set of clinical images, measuring whether overlap and connectivity metrics exceed those of baseline methods.

Original abstract

The task of parsing subcutaneous vessels in clinical images is often hindered by the high cost and limited availability of ground truth data, as well as the challenge of low contrast and noisy vessel appearances across different patients and imaging modalities. In this work, we propose a novel weakly supervised training framework specifically designed for subcutaneous vessel segmentation. This method utilizes low-cost, sparse annotations such as centerline traces, dot markers, or short scribbles to guide the learning process. These sparse annotations are expanded into dense probabilistic supervision through a differentiable random walk label propagation model, which integrates vesselness cues and tubular continuity priors driven by image data. The label propagation process results in per-pixel hitting probabilities and uncertainty estimates, which are incorporated into an uncertainty-weighted loss function to prevent overfitting in ambiguous areas. Notably, the label propagation model is trained jointly with a CNN-based segmentation network, allowing the system to learn vessel boundaries and continuity constraints without the need for explicit edge supervision. Additionally, we introduce a topology-aware regularizer that encourages centerline connectivity and penalizes irrelevant branches, further enhancing clinical applicability. Our experiments on clinical subcutaneous imaging datasets demonstrate that our approach consistently outperforms both naive sparse-label training and traditional dense pseudo-labeling methods, yielding more accurate vascular maps and better-calibrated uncertainty, which is crucial for clinical decision-making. This method significantly reduces the annotation workload while maintaining clinically relevant vessel topology.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper jointly trains a differentiable random walk to expand sparse vessel annotations into dense supervision alongside a CNN, but the end-to-end setup leaves open whether the propagation step itself drives the gains or gets masked by the other losses.

The main point is a weakly supervised pipeline for subcutaneous vessel segmentation that learns random-walk propagation from the image data rather than relying on fixed affinities. Sparse inputs like centerlines or scribbles get turned into per-pixel hitting probabilities, which then supervise a CNN through an uncertainty-weighted loss, plus a topology regularizer that pushes for connected centerlines and drops stray branches. The joint training lets the propagation adapt to vesselness cues and tubular continuity without separate edge labels.

This is a targeted move for a setting where dense ground truth is expensive and low-contrast images make manual labeling slow. The approach does address a practical bottleneck in clinical vascular mapping and reports better accuracy and uncertainty calibration than plain sparse training or off-the-shelf pseudo-labeling. Those are real advantages if the numbers hold up on the clinical datasets they used.

The soft spot is the lack of clear isolation for the propagation component. Because the CNN trains jointly and the loss down-weights uncertain pixels while the regularizer enforces topology, it is possible for the segmentation network to compensate for weak or noisy propagated labels. Without separate checks, such as overlap between the learned hitting probabilities and held-out dense annotations or direct comparisons against fixed-affinity random walks, it is hard to know how much the learned affinities are actually contributing versus the rest of the model smoothing things over. The abstract claims consistent outperformance, but the strength of that claim rests on seeing the ablations and baseline details.

This paper is for groups working on weakly supervised segmentation of thin structures in medical images. A reader who needs concrete ways to reduce annotation effort for vessels would find usable ideas here. It is coherent enough and grounded in a real clinical need to deserve peer review, mainly to verify the experimental controls and confirm the propagation is not being bypassed.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces VesselRW, a weakly supervised framework for subcutaneous vessel segmentation in clinical images. Sparse annotations (centerlines, dots, or scribbles) are expanded into dense probabilistic supervision via a differentiable random walk propagation model that incorporates learned vesselness cues and tubular continuity priors from the image data. The propagation model is trained jointly with a CNN segmentation network; an uncertainty-weighted loss prevents overfitting in ambiguous regions, and a topology-aware regularizer encourages centerline connectivity. Experiments on clinical subcutaneous imaging datasets are reported to show consistent outperformance over naive sparse-label training and traditional dense pseudo-labeling, with improved vascular map accuracy and better-calibrated uncertainty.

Significance. If the central claims hold, the work could meaningfully reduce annotation burden for vessel segmentation in medical imaging while preserving clinically relevant topology. The joint optimization of a learned random-walk propagator with a CNN is a distinctive technical element that, if shown to produce reliable dense supervision rather than being compensated by the network, would strengthen the case for learned propagation in weak-supervision pipelines.

major comments (2)

[Experiments] The central claim that the learned random walk supplies faithful dense supervision from sparse labels is load-bearing, yet the manuscript provides no diagnostic that isolates the propagation step (e.g., overlap of propagated maps with held-out dense annotations or direct comparison against a fixed-affinity random-walk baseline). Without such evidence it remains possible that the CNN simply learns to ignore or correct noisy pseudo-labels, undermining the attribution of gains to the propagation model.
[Results] The abstract and results claim superior uncertainty calibration, but no quantitative calibration metrics (e.g., expected calibration error or reliability diagrams) or comparison against the baselines are referenced. This omission makes it impossible to verify whether the uncertainty-weighted loss and learned propagation actually improve calibration or merely correlate with the reported accuracy gains.
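For readers unfamiliar with the calibration metric the referee asks for, the following is a minimal sketch of a binned expected calibration error for binary per-pixel predictions. It assumes one common binary variant (equal-width bins over the predicted foreground probability, comparing the mean prediction in each bin with the observed fraction of foreground pixels); the function name and the choice of ten bins are illustrative and not taken from the paper.

```python
def expected_calibration_error(probs, labels, n_bins=10):
    """Binned ECE for binary predictions.

    probs  : predicted foreground probabilities in [0, 1], one per pixel
    labels : ground-truth labels (0 or 1), same length as probs
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        # Clamp so that p == 1.0 lands in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))

    n = len(probs)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        mean_pred = sum(p for p, _ in bucket) / len(bucket)
        frac_pos = sum(y for _, y in bucket) / len(bucket)
        # Weight each bin's calibration gap by its share of the pixels.
        ece += (len(bucket) / n) * abs(mean_pred - frac_pos)
    return ece
```

A value near zero means predicted probabilities match observed frequencies. A reliability diagram plots the same per-bin `mean_pred` against `frac_pos`.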

minor comments (2)

[Method] The description of the random-walk affinities and the precise form of the uncertainty-weighted loss would benefit from an explicit equation or pseudocode block to clarify how vesselness and continuity priors are encoded.
[Experiments] The clinical datasets are referred to only generically; adding a table or paragraph with imaging modality, resolution, number of patients, and annotation density would improve reproducibility and allow readers to assess generalizability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment point by point below and outline targeted revisions to strengthen the manuscript.

Referee: [Experiments] The central claim that the learned random walk supplies faithful dense supervision from sparse labels is load-bearing, yet the manuscript provides no diagnostic that isolates the propagation step (e.g., overlap of propagated maps with held-out dense annotations or direct comparison against a fixed-affinity random-walk baseline). Without such evidence it remains possible that the CNN simply learns to ignore or correct noisy pseudo-labels, undermining the attribution of gains to the propagation model.

Authors: We agree that isolating the contribution of the learned random walk is important for attributing performance gains. In the revised manuscript we will add a direct comparison against a fixed-affinity random-walk baseline that uses the same propagation architecture but with non-learned affinities. On the subset of our clinical datasets that include held-out dense annotations, we will also report overlap metrics (Dice and IoU) between the propagated probabilistic maps and these dense labels to demonstrate the fidelity of the supervision signal. revision: yes
Referee: [Results] The abstract and results claim superior uncertainty calibration, but no quantitative calibration metrics (e.g., expected calibration error or reliability diagrams) or comparison against the baselines are referenced. This omission makes it impossible to verify whether the uncertainty-weighted loss and learned propagation actually improve calibration or merely correlate with the reported accuracy gains.

Authors: We acknowledge that quantitative calibration evidence is needed to support the claim. In the revision we will add expected calibration error (ECE) values and reliability diagrams for VesselRW and all baselines. These metrics will be computed on the test sets and will allow direct verification of whether the uncertainty-weighted loss and joint training improve calibration beyond the observed accuracy improvements. revision: yes
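The overlap check promised in the first response can be sketched as follows. This is an assumption-laden illustration rather than the authors' evaluation code: it thresholds a propagated probability map at 0.5 (a hypothetical choice) and compares it pixel by pixel with a dense binary mask, returning Dice and IoU.

```python
def dice_and_iou(prob_map, dense_mask, threshold=0.5):
    """Dice and IoU between a thresholded probability map and a binary mask.

    prob_map   : 2D list of propagated foreground probabilities
    dense_mask : 2D list of 0/1 dense annotations, same shape
    """
    tp = fp = fn = 0
    for prob_row, mask_row in zip(prob_map, dense_mask):
        for p, m in zip(prob_row, mask_row):
            pred = p >= threshold
            if pred and m:
                tp += 1
            elif pred:
                fp += 1
            elif m:
                fn += 1

    dice_den = 2 * tp + fp + fn
    iou_den = tp + fp + fn
    # Convention (assumed): two empty masks count as perfect agreement.
    dice = 2 * tp / dice_den if dice_den else 1.0
    iou = tp / iou_den if iou_den else 1.0
    return dice, iou
```

Running the same function on maps from a fixed-affinity random walk and from the learned one would give the side-by-side comparison the referee requests.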

Circularity Check

0 steps flagged

No significant circularity; propagation model provides independent image-driven signal

The framework expands sparse annotations via a differentiable random walk that incorporates vesselness cues and tubular continuity priors directly from the image data. This propagation produces per-pixel hitting probabilities used as supervision for the CNN, with joint training allowing the affinities to adapt to the data rather than being fixed or self-referential. Experiments explicitly compare against naive sparse-label training and traditional dense pseudo-labeling baselines on clinical datasets, with reported gains in accuracy and uncertainty calibration. No equations or steps in the abstract reduce the output supervision to a fit of the target labels by construction, and no self-citation chain is invoked to justify uniqueness or force the result. The derivation remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the premise that image-derived vesselness and continuity cues are sufficient to turn sparse marks into reliable dense supervision when the propagation and segmentation networks are trained together; no explicit free parameters or invented entities are named in the abstract.

axioms (2)

domain assumption: Sparse annotations such as centerlines or scribbles can be expanded into dense per-pixel hitting probabilities by a random walk that follows vesselness cues and tubular continuity priors present in the image data.
This is the core mechanism stated in the abstract for generating the probabilistic supervision used in the loss.
domain assumption: Joint training of the propagation model with the CNN allows learning of vessel boundaries and continuity without explicit edge supervision.
Abstract claims this joint optimization removes the need for additional edge labels.

pith-pipeline@v0.9.0 · 5800 in / 1550 out tokens · 45660 ms · 2026-05-19T00:33:51.062902+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

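We define $P_y(x)\mid\hat{y},M$ via random-walk hitting probabilities on the pixel grid... transition weights $T(x \to x') := \frac{1}{Z_x}\,\exp\!\big(-B_{\phi,I}(x)\big)\,\exp\!\big(-\lambda\,\Delta_{\mathrm{orient}}(x,x')\big)\,\big(1-\mu\,(1-V_{\psi,I}(x))\big)$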
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

The core training objective is $\min_{\theta,\phi,\psi} \sum w(x)\, D_{\mathrm{KL}}\big(P_y(x)\mid\hat{y},M_{\phi,\psi,I} \,\|\, Q_{\theta,I}(x)\big) + \beta H(P_y) + \gamma R_{\mathrm{topo}}(P)$
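The structure of the quoted training objective can be illustrated with a small numeric sketch. Several things here are assumptions, not details from the paper: each pixel is treated as a binary (Bernoulli) distribution, the sum runs over pixels, the entropy term is summed per pixel, and the topology term is passed in as a precomputed number `r_topo` because its form is not given in the quoted passage. All function names are hypothetical.

```python
import math


def _clip(p, eps=1e-7):
    # Keep probabilities away from 0 and 1 so the logarithms stay finite.
    return min(max(p, eps), 1.0 - eps)


def bernoulli_kl(p, q):
    """KL divergence D_KL(Bernoulli(p) || Bernoulli(q))."""
    p, q = _clip(p), _clip(q)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def bernoulli_entropy(p):
    """Shannon entropy (in nats) of Bernoulli(p)."""
    p = _clip(p)
    return -(p * math.log(p) + (1 - p) * math.log(1 - p))


def training_objective(P, Q, w, beta, gamma, r_topo):
    """Weighted KL between propagated labels P and network predictions Q,
    plus beta * entropy of P and gamma * a precomputed topology penalty.

    P, Q, w are flat lists with one entry per pixel.
    """
    kl_term = sum(wi * bernoulli_kl(pi, qi) for pi, qi, wi in zip(P, Q, w))
    entropy_term = sum(bernoulli_entropy(pi) for pi in P)
    return kl_term + beta * entropy_term + gamma * r_topo


# Usage: the ambiguous second pixel is down-weighted (w = 0.2).
loss = training_objective(
    P=[0.95, 0.5], Q=[0.9, 0.3], w=[1.0, 0.2],
    beta=0.1, gamma=0.0, r_topo=0.0,
)
```

Setting a small `w` on ambiguous pixels limits how strongly the network is pulled toward uncertain propagated labels, which is the role the abstract assigns to the uncertainty-weighted loss.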

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
