Privacy-Preserving Semantic Segmentation from Ultra-Low-Resolution RGB Inputs

Juergen Gall; Maren Bennewitz; Olga Zatsarynna; Sicong Pan; Xuying Huang

arxiv: 2507.16034 · v2 · submitted 2025-07-21 · 💻 cs.RO · cs.CV


Pith reviewed 2026-05-19 03:24 UTC · model grok-4.3

classification 💻 cs.RO cs.CV

keywords semantic segmentation · privacy preservation · ultra-low-resolution · joint learning · visual degradation · robotic navigation · RGB inputs


The pith

A fully joint-learning framework mitigates optimization conflicts from visual degradation to enable semantic segmentation on ultra-low-resolution RGB inputs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper aims to establish that semantic segmentation remains feasible on ultra-low-resolution RGB images when a joint-learning process directly tackles the training conflicts created by severe visual loss. This would matter to a sympathetic reader because ultra-low-resolution capture suppresses private details at the sensor itself, avoiding the exposure risks of high-resolution cameras in homes, hospitals, or other sensitive spaces. If the framework succeeds, visual perception tasks can proceed without sacrificing the privacy benefits of degraded inputs. Experiments compare performance against baselines and demonstrate viability in a real robotic navigation scenario.

Core claim

The central claim is that a novel fully joint-learning framework mitigates the optimization conflicts exacerbated by visual degradation for ultra-low-resolution semantic segmentation, yielding higher accuracy than representative baselines, a favorable privacy-performance trade-off, and successful execution of a downstream robotic object-goal navigation task.

What carries the argument

The fully joint-learning framework, which integrates resolution handling and segmentation objectives to resolve conflicts that arise during training on degraded inputs.

Load-bearing premise

Severe visual degradation from ultra-low-resolution RGB inputs produces optimization conflicts that a joint-learning framework can resolve.

What would settle it

An experiment that trains the joint framework and separate baseline networks on the same ultra-low-resolution dataset and finds no measurable accuracy gain for the joint approach.
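A comparison of this kind needs a fixed accuracy metric applied to both models. The sketch below assumes mean intersection-over-union (mIoU), a common segmentation metric; this summary does not confirm which metric the paper reports. The label maps are hypothetical toy data, and `mean_iou`, `joint_pred`, and `baseline_pred` are illustrative names only.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes present in pred or target."""
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            continue  # class absent from both maps, so it is skipped
        inter = np.logical_and(pred_c, target_c).sum()
        ious.append(inter / union)
    return float(np.mean(ious))

# Hypothetical 2x3 label maps with three classes (not data from the paper).
target = np.array([[0, 0, 1], [1, 2, 2]])
joint_pred = np.array([[0, 0, 1], [1, 2, 2]])     # matches the target exactly
baseline_pred = np.array([[0, 1, 1], [1, 2, 0]])  # two mislabeled pixels

# A gain near zero on a real test set would count against the central claim.
gain = mean_iou(joint_pred, target, 3) - mean_iou(baseline_pred, target, 3)
```

In a real test, both models would be trained on the same ultra-low-resolution data and scored on the same held-out split, ideally over several random seeds so that a small gain can be told apart from noise.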

Figures

Figures reproduced from arXiv: 2507.16034 by Juergen Gall, Maren Bennewitz, Olga Zatsarynna, Sicong Pan, Xuying Huang.

**Figure 1.** Our key innovation lies in enabling object-goal navigation through improved semantic segmentation from ultra-low-resolution RGB

**Figure 3.** Proposed segmentation-aware discriminator network archi

**Figure 4.** Visual comparison of super-resolution RGB images and semantic segmentation maps. We compare the visualization results of

**Figure 5.** Semantic segmentation results during navigation across two

Original abstract

RGB-based semantic segmentation has become a mainstream approach for visual perception and is widely applied in a variety of downstream tasks. However, existing methods typically rely on high-resolution RGB inputs, which may expose sensitive visual content in privacy-critical environments. Ultra-low-resolution RGB sensing suppresses sensitive information directly during image acquisition, making it an attractive privacy-preserving alternative. Nevertheless, recovering semantic segmentation from ultra-low-resolution RGB inputs remains highly challenging due to severe visual degradation. In this work, we introduce a novel fully joint-learning framework to mitigate the optimization conflicts exacerbated by visual degradation for ultra-low-resolution semantic segmentation. Experiments demonstrate that our method outperforms representative baselines in semantic segmentation performance and our ultra-low-resolution RGB input achieves a favorable trade-off between privacy preservation and semantic segmentation performance. We deploy our privacy-preserving semantic segmentation method in a real-world robotic object-goal navigation task, demonstrating successful downstream task execution even under severe visual degradation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper introduces a joint-learning setup for semantic segmentation on ultra-low-res RGB to keep privacy while supporting robot tasks, and it includes a real deployment, but the key claim about resolving specific optimization conflicts rests on indirect performance gains rather than direct measurements.


The central contribution is a fully joint-learning framework meant to handle the extra difficulties that come with doing semantic segmentation on ultra-low-resolution RGB images. The idea is that dropping resolution early preserves privacy by removing fine details at capture time, but it also hurts the segmentation objective, and the authors position their joint approach as a way to ease those tensions. They report better segmentation numbers than some baselines and show the whole thing running on a robot for object-goal navigation under heavy degradation.

That deployment is the strongest part of the work; it moves beyond simulation or toy metrics and shows the pipeline can actually drive a downstream task in the real world. For anyone working on perception in homes, hospitals, or other places where high-res cameras raise privacy issues, this is a concrete data point worth noting. The low-res input itself is a simple hardware-level choice that avoids the need for extra privacy filters or encryption steps after the fact.

On the soft spots, the abstract and summary do not define the optimization conflicts in any measurable way, such as through gradient statistics or targeted ablations that isolate the joint-training component. Without those, it is hard to tell whether the reported gains come from the joint framework specifically or from other design choices like architecture tweaks or training schedule. The lack of listed baselines, datasets, and exact metrics in the high-level description also makes it difficult to judge how large or robust the improvements really are.

This paper is aimed at robotics and computer-vision groups that need practical privacy-aware perception rather than pure theoretical advances in low-resolution vision. A reader already working on low-res or privacy-constrained systems would get the most out of the deployment results and the overall framing. It is solid enough on the application side to deserve a full referee process, even if the methods section will need closer scrutiny on how the conflicts were handled and measured.

Referee Report

2 major / 2 minor

Summary. The paper introduces a novel fully joint-learning framework for semantic segmentation from ultra-low-resolution RGB inputs, claiming that this approach mitigates optimization conflicts caused by severe visual degradation. It reports outperformance over representative baselines, a favorable privacy-performance trade-off, and successful deployment in a real-world robotic object-goal navigation task.

Significance. If the central claims hold with stronger evidence, the work could contribute to privacy-preserving perception in robotics by showing that ultra-low-resolution inputs can support downstream tasks without exposing sensitive visual details.

major comments (2)

[§3] §3 (Methods): The motivation centers on 'optimization conflicts exacerbated by visual degradation,' but these conflicts are not formally defined (no equations for gradient interference or multi-objective trade-offs) nor directly measured (e.g., via cosine similarity of gradients between privacy and segmentation objectives). Without such quantification or targeted ablations that disable joint components while holding other factors fixed, the claim that the fully joint-learning framework specifically mitigates them rests on indirect performance gains.
[§4] §4 (Experiments): The abstract and results claim outperformance and successful deployment, yet the manuscript lacks explicit details on baseline implementations, exact datasets and splits, quantitative privacy metrics (e.g., face detection rates or information leakage measures), and ablations isolating the joint-learning effect. This weakens the ability to attribute gains to conflict mitigation rather than general architectural choices.
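The gradient-similarity measurement requested in the first major comment can be sketched as follows. This is a minimal illustration, not the paper's procedure: it uses two toy least-squares objectives that share one weight vector, with analytic gradients, and the names `grad_cosine` and `mse_grad` are hypothetical. In a real study the two gradients would come from the framework's own loss terms with respect to its shared parameters.

```python
import numpy as np

def grad_cosine(g_a, g_b, eps=1e-12):
    """Cosine similarity between two flattened gradient vectors."""
    g_a, g_b = np.ravel(g_a), np.ravel(g_b)
    return float(g_a @ g_b / (np.linalg.norm(g_a) * np.linalg.norm(g_b) + eps))

def mse_grad(w, x, y):
    """Gradient of mean((x @ w - y)**2) with respect to w."""
    return 2.0 * x.T @ (x @ w - y) / len(y)

rng = np.random.default_rng(0)
x = rng.normal(size=(64, 8))   # shared inputs
w = np.zeros(8)                # shared parameters, evaluated at initialization
w_true = rng.normal(size=8)
y_a = x @ w_true               # targets for toy objective A
y_b = -y_a                     # toy objective B pulls in the opposite direction

# Values near -1 indicate directly conflicting updates; near +1, aligned ones.
cos_conflict = grad_cosine(mse_grad(w, x, y_a), mse_grad(w, x, y_b))
cos_aligned = grad_cosine(mse_grad(w, x, y_a), mse_grad(w, x, y_a))
```

Logging this value over training, for the joint framework and for a separately trained variant, would give the direct evidence the comment asks for, provided the architecture and schedule are held fixed.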

minor comments (2)

[Abstract] Abstract: Specify the exact ultra-low resolutions tested (e.g., pixel dimensions) and the privacy evaluation protocol to make the trade-off claim more concrete.
[Figures/Tables] Notation and figures: Ensure consistent use of symbols for resolution levels and loss terms across text and diagrams; add error bars or statistical tests to performance tables.
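To make the request for exact pixel dimensions concrete, the sketch below shows one way to simulate ultra-low-resolution input in software by block averaging. The 64 x 64 source size and the factor of 8 are hypothetical values, not resolutions taken from the paper, and `downsample_area` is an illustrative helper. The paper's premise is that degradation happens at acquisition, so software downsampling is only a stand-in for evaluation.

```python
import numpy as np

def downsample_area(image, factor):
    """Average non-overlapping factor x factor blocks of an H x W x C image."""
    h, w, c = image.shape
    if h % factor or w % factor:
        raise ValueError("height and width must be divisible by factor")
    # Split each spatial axis into (blocks, pixels-per-block), then average.
    blocks = image.reshape(h // factor, factor, w // factor, factor, c)
    return blocks.mean(axis=(1, 3))

rng = np.random.default_rng(0)
high_res = rng.integers(0, 256, size=(64, 64, 3)).astype(np.float64)
low_res = downsample_area(high_res, 8)  # hypothetical factor; result is 8 x 8 x 3
```

Reporting the output dimensions of a step like this, alongside the privacy protocol applied to the resulting images, would let readers place each result on the privacy-performance curve.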

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful review and constructive suggestions. We address each of the major comments in detail below and outline the revisions we will make to strengthen the manuscript.


Referee: [§3] §3 (Methods): The motivation centers on 'optimization conflicts exacerbated by visual degradation,' but these conflicts are not formally defined (no equations for gradient interference or multi-objective trade-offs) nor directly measured (e.g., via cosine similarity of gradients between privacy and segmentation objectives). Without such quantification or targeted ablations that disable joint components while holding other factors fixed, the claim that the fully joint-learning framework specifically mitigates them rests on indirect performance gains.

Authors: We appreciate this observation. The manuscript motivates the fully joint-learning framework by explaining that severe visual degradation in ultra-low-resolution inputs creates competing objectives: the privacy goal favors maximal information loss, while segmentation requires preserving semantic cues. This leads to optimization challenges in joint training. While we did not include explicit equations for gradient interference in the initial submission, we will revise Section 3 to formally define the problem as a multi-task optimization with segmentation loss and a privacy regularization term. We will also add an analysis of gradient similarities and targeted ablations that isolate the joint optimization by comparing to separate training of components. These changes will provide direct evidence for the mitigation of conflicts.

revision: yes

Referee: [§4] §4 (Experiments): The abstract and results claim outperformance and successful deployment, yet the manuscript lacks explicit details on baseline implementations, exact datasets and splits, quantitative privacy metrics (e.g., face detection rates or information leakage measures), and ablations isolating the joint-learning effect. This weakens the ability to attribute gains to conflict mitigation rather than general architectural choices.

Authors: We agree that reproducibility and attribution of results require more detailed reporting. Although the manuscript provides an overview of the experimental setup, datasets, and baselines, we will expand the Experiments section to include: precise descriptions of baseline implementations and hyperparameters, exact dataset splits used, additional quantitative privacy metrics including face detection rates on the input images and measures of information leakage, and dedicated ablations that hold the architecture fixed while varying the joint-learning strategy. For the robotic deployment, we will elaborate on the task setup and success metrics. These revisions will better support the claims regarding the benefits of the proposed framework.

revision: yes

Circularity Check

0 steps flagged

No significant circularity; central claim rests on empirical framework and experiments, not self-referential derivation.


The paper introduces a novel fully joint-learning framework motivated by visual degradation in ultra-low-resolution RGB inputs for semantic segmentation. No equations, derivations, or parameter-fitting steps are described in the abstract or summary that reduce any prediction or result to its own inputs by construction. The optimization-conflicts premise serves as motivation rather than a load-bearing self-definition or fitted input renamed as prediction. No self-citation chains, uniqueness theorems from prior author work, or ansatz smuggling are indicated. The result is presented as validated through performance comparisons and robotic deployment, making the derivation self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Based on the abstract alone, no explicit free parameters, axioms, or invented entities are stated. The approach likely relies on standard deep-learning training assumptions and existing segmentation architectures without introducing new postulated entities.

pith-pipeline@v0.9.0 · 5695 in / 1053 out tokens · 32957 ms · 2026-05-19T03:24:43.111177+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel

Relation between the paper passage and the cited Recognition theorem: unclear

Paper passage: "We introduce a novel fully joint-learning framework... agglomerative feature extractor and a segmentation-aware discriminator... $L_{\mathrm{fea}} = L_1 + L_{\cos}$, $L_D = L_{\mathrm{BCE}}$... $L_{\mathrm{adv}}$"

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction

Relation between the paper passage and the cited Recognition theorem: unclear

Paper passage: "Our method outperforms... on SUN RGB-D... real-world robotic object-goal navigation"
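The loss terms named in the first quoted passage (a feature loss Lfea = L1 + Lcos and a discriminator loss LD = LBCE) can be illustrated with a small numpy sketch. The exact definitions are elided in the passage, so everything here is an assumption: the cosine term is taken to be one minus cosine similarity along the feature axis, the terms are equally weighted, and the adversarial term is omitted because the passage gives no form for it. All function names are hypothetical.

```python
import numpy as np

def l1_loss(pred, target):
    """Mean absolute error between two feature arrays."""
    return float(np.mean(np.abs(pred - target)))

def cosine_loss(pred, target, eps=1e-8):
    """Assumed form: 1 - cosine similarity along the last axis, averaged."""
    dot = np.sum(pred * target, axis=-1)
    norms = np.linalg.norm(pred, axis=-1) * np.linalg.norm(target, axis=-1)
    return float(np.mean(1.0 - dot / (norms + eps)))

def feature_loss(pred, target):
    """Assumed equal-weight sum of the L1 and cosine terms."""
    return l1_loss(pred, target) + cosine_loss(pred, target)

def bce_loss(prob, label, eps=1e-12):
    """Binary cross-entropy on discriminator output probabilities."""
    prob = np.clip(prob, eps, 1.0 - eps)
    return float(-np.mean(label * np.log(prob) + (1.0 - label) * np.log(1.0 - prob)))

feats = np.array([[1.0, 0.0], [0.0, 2.0]])   # toy feature vectors
same = feature_loss(feats, feats)            # close to 0 for identical features
flipped = feature_loss(feats, -feats)        # large when directions are reversed
d_loss = bce_loss(np.array([0.5, 0.5]), np.array([1.0, 0.0]))  # undecided discriminator
```

Pairing an L1 term with a cosine term penalizes both magnitude and direction mismatches in feature space, which is one plausible reason to combine them; the paper itself should be consulted for the actual weighting and the adversarial term.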

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Designing Privacy-Preserving Visual Perception for Robot Navigation Based on User Privacy Preferences
cs.RO 2026-04 unverdicted novelty 5.0

User studies reveal preferences for visual abstractions and distance-dependent low-resolution capture, leading to a configurable privacy policy for robot navigation.

Reference graph

Works this paper leans on

26 extracted references · 26 canonical work pages · cited by 1 Pith paper · 2 internal anchors

[1] G. K. Alshammari, A. Abubakar, N. M. Ahmed, and N. K. Alshammari, “Federated Learning-based Semantic Segmentation for Lane and Object Detection in Autonomous Driving,” arXiv preprint arXiv:2504.18939, 2025.

[2] P. Arbelaez, M. Maire, C. Fowlkes, and J. Malik, “Contour detection and hierarchical image segmentation,” IEEE Trans. on Pattern Analysis and Machine Intelligence (PAMI), 2010.

[3] J. Caputa, M. Wielgosz, D. Łukasik, P. Russek, J. Grzeszczyk, M. Karwatowski, S. Mazurek, R. Frączek, A. Śmiech, E. Jamro et al., “Using super-resolution for enhancing visual perception and segmentation performance in veterinary cytology,” Journal of Life (Life), 2024.

[4] L.-C. Chen, Y. Zhu, G. Papandreou, F. Schroff, and H. Adam, “Encoder-decoder with atrous separable convolution for semantic image segmentation,” in Proc. of the Europ. Conf. on Computer Vision (ECCV), 2018.

[5] M. M. Coffer, “Balancing privacy rights and the production of high-quality satellite imagery,” 2020.

[6] T. Frizza, D. G. Dansereau, N. M. Seresht, and M. Bewley, “Semantically accurate super-resolution generative adversarial networks,” Journal of Computer Vision and Image Understanding (CVIU), vol. 221, 2022.

[7] X. Huang, S. Pan, and M. Bennewitz, “Privacy risks of robot vision: A user study on image modalities and resolution,” arXiv preprint arXiv:2505.07766, 2025.

[8] M. U. Kim, H. Lee, H. J. Yang, and M. S. Ryoo, “Privacy-preserving robot vision with anonymized faces by extreme low resolution,” in Proc. of the IEEE/RSJ Intl. Conf. on Intelligent Robots and Systems (IROS), 2019.

[9] Y. Kong and S. Liu, “Dmsc-gan: A c-gan-based framework for super-resolution reconstruction of sar images,” Remote Sensing, 2023.

[10] C. Ledig, L. Theis, F. Huszár, J. Caballero, A. Cunningham, A. Acosta, A. Aitken, A. Tejani, J. Totz, Z. Wang et al., “Photo-realistic single image super-resolution using a generative adversarial network,” in Proc. of the IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), 2017.

[11] Z. Liu, H. Zhu, R. Chen, J. Francis, S. Hwang, J. Zhang, and J. Oh, “Mosaic: Generating consistent, privacy-preserving scenes from multiple depth views in multi-room environments,” arXiv preprint arXiv:2503.13816, 2025.

[12] T. Miyato, T. Kataoka, M. Koyama, and Y. Yoshida, “Spectral normalization for generative adversarial networks,” arXiv preprint arXiv:1802.05957, 2018. (Pith internal anchor)

[13] M. B. Pereira and J. A. dos Santos, “An end-to-end framework for low-resolution remote sensing semantic segmentation,” in IEEE Latin American GRSS & ISPRS Remote Sensing Conference, 2020.

[14] M. Pietrantoni, M. Humenberger, T. Sattler, and G. Csurka, “Segloc: Learning segmentation-based representations for privacy-preserving visual localization,” in Proc. of the IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), 2023.

[15] M. Ranzinger, G. Heinrich, J. Kautz, and P. Molchanov, “Am-radio: Agglomerative vision foundation model reduce all domains into one,” in Proc. of the IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), 2024.

[16] D. Reinhardt, M. Khurana, and L. H. Acosta, “‘i still need my privacy’: Exploring the level of comfort and privacy preferences of german-speaking older adults in the case of mobile assistant robots,” Journal of Pervasive and Mobile Computing (PMC), vol. 74, 2021.

[17] M. Rueben and W. D. Smart, “Privacy in human-robot interaction: Survey and future work,” Proc. of the Intl. Conf. on We robot, 2016.

[18] N. Silberman, D. Hoiem, P. Kohli, and R. Fergus, “Indoor segmentation and support inference from rgbd images,” in Proc. of the Europ. Conf. on Computer Vision (ECCV), 2012.

[19] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv preprint arXiv:1409.1556, 2014. (Pith internal anchor)

[20] S. Song, S. P. Lichtenberg, and J. Xiao, “Sun rgb-d: A rgb-d scene understanding benchmark suite,” in Proc. of the IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), 2015.

[21] J. Sun, J. Wu, Z. Ji, and Y.-K. Lai, “A survey of object goal navigation,” IEEE Trans. on Automation Science and Engineering (TASE), 2024.

[22] V. Sushko, E. Schönfeld, D. Zhang, J. Gall, B. Schiele, and A. Khoreva, “You only need adversarial supervision for semantic image synthesis,” in Proc. of the Intl. Conf. on Learning Representations (ICLR), 2021.

[23] A. K. Taras, N. Suenderhauf, P. Corke, and D. G. Dansereau, “The need for inherently privacy-preserving vision in trustworthy autonomous systems,” arXiv preprint arXiv:2303.16408, 2023.

[24] X. Wang, K. Yu, S. Wu, J. Gu, Y. Liu, C. Dong, Y. Qiao, and C. Change Loy, “Esrgan: Enhanced super-resolution generative adversarial networks,” in Proceedings of the European conference on computer vision (ECCV) workshops, 2018.

[25] M. Xu, M. Islam, L. Bai, and H. Ren, “Privacy-preserving synthetic continual semantic segmentation for robotic surgery,” IEEE Trans. on medical imaging (TMI), 2024.

[26] N. Yokoyama, S. Ha, D. Batra, J. Wang, and B. Bucher, “Vlfm: Vision-language frontier maps for zero-shot semantic navigation,” in Proc. of the IEEE Intl. Conf. on Robotics & Automation (ICRA), 2024.
