LFX: Towards Unified Light Field Dense Semantic Segmentation and Salient Object Detection
Pith reviewed 2026-05-23 01:13 UTC · model grok-4.3
The pith
LFX creates a representation-invariant space that lets one model handle any light field format for both semantic segmentation and salient object detection.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
LFX establishes a representation-invariant feature modulation space, enabling adaptation to heterogeneous LF representations and diverse perception tasks through Field-of-Parallax Angular Subspace Modeling that assigns independent angular markers to each auxiliary view, with shared manifold subspace constraints and regularization losses enforcing globally consistent semantic modulation across views.
What carries the argument
Field-of-Parallax Angular Subspace Modeling (FoP-ASM) combined with shared manifold subspace constraints, which allows view-wise independent modeling while maintaining global consistency.
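One way to picture "view-wise independent modeling with global consistency" is sketched below. This is an illustrative reading of the abstract, not the paper's implementation: the marker vectors, the shared basis, the multiplicative gate, and the variance-style penalty are all assumptions introduced here, and every function name is hypothetical.

```python
import random

def modulate_views(features, markers, shared_basis):
    """Modulate each view with a gate derived from its own marker.

    features:     list of per-view feature vectors, each of length d
    markers:      list of per-view angular markers, each of length k (hypothetical)
    shared_basis: k x d matrix shared by all views (hypothetical)
    """
    modulated, gates = [], []
    for feat, marker in zip(features, markers):
        # Project the view's own marker through the shared basis: gate = marker @ basis.
        gate = [sum(m * row[j] for m, row in zip(marker, shared_basis))
                for j in range(len(feat))]
        gates.append(gate)
        modulated.append([f * (1.0 + g) for f, g in zip(feat, gate)])
    return modulated, gates

def consistency_penalty(gates):
    """Mean squared deviation of each view's gate from the cross-view mean gate."""
    n, d = len(gates), len(gates[0])
    mean_gate = [sum(g[j] for g in gates) / n for j in range(d)]
    return sum((g[j] - mean_gate[j]) ** 2 for g in gates for j in range(d)) / (n * d)

# Toy example: 3 auxiliary views, 4-dimensional features, 2-dimensional markers.
rng = random.Random(0)
d, k, n_views = 4, 2, 3
basis = [[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(k)]
feats = [[1.0] * d for _ in range(n_views)]
marks = [[rng.uniform(-1.0, 1.0) for _ in range(k)] for _ in range(n_views)]
_, toy_gates = modulate_views(feats, marks, basis)
print(consistency_penalty(toy_gates))
```

In this toy version each view keeps its own marker (independence), while the single basis and the penalty pull the resulting gates toward one another (consistency). Whether LFX uses anything of this form cannot be determined from the abstract alone.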
If this is right
- LFX achieves state-of-the-art performance on three LF benchmarks for both semantic segmentation and salient object detection.
- It outperforms representation-specific methods by up to 12% and 20% on salient object detection with MAE of 0.029/0.027.
- It reaches 84.37 mIoU for semantic segmentation.
- The framework works across distinct LF representations without specific tuning.
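For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of their standard definitions: mean absolute error (MAE, lower is better) for saliency maps and mean intersection-over-union (mIoU, higher is better) for label maps. The function names are hypothetical, inputs are flattened to plain lists, and the paper's exact evaluation protocol (for example per-image versus dataset-level accumulation) is not stated in the abstract.

```python
def mean_absolute_error(pred, target):
    """MAE between a predicted saliency map and ground truth (flat lists in [0, 1])."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def mean_iou(pred_labels, true_labels, num_classes):
    """Mean IoU over the classes that occur in the prediction or the ground truth."""
    ious = []
    for c in range(num_classes):
        pairs = list(zip(pred_labels, true_labels))
        inter = sum(1 for p, t in pairs if p == c and t == c)
        union = sum(1 for p, t in pairs if p == c or t == c)
        if union:  # skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0
```

An mIoU reported as 84.37 corresponds to 0.8437 on the 0 to 1 scale returned here.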
Where Pith is reading between the lines
- The approach could generalize to additional light field tasks such as depth estimation or refocusing.
- Adoption might reduce computational overhead in systems that currently maintain separate models for each representation.
- Future work could test the framework on light fields from different camera hardware not included in the current benchmarks.
Load-bearing premise
Shared manifold subspace constraints and regularization losses will enforce globally consistent semantic modulation across views for arbitrary light field representations and tasks without requiring representation-specific tuning.
What would settle it
Applying LFX to a previously unseen light field representation or task and observing whether it maintains performance without additional tuning or retraining.
Original abstract
Light field cameras capture multi-view observations within a single exposure. However, existing studies are typically tailored to specific LF representations, leaving the field without a unified learning framework. To bridge this gap, we present LFX, the first unified framework for LF perception. LFX establishes a representation-invariant feature modulation space, enabling it to adapt to heterogeneous LF representations and diverse perception tasks. Specifically, we propose Field-of-Parallax Angular Subspace Modeling (FoP-ASM), which assigns an independent angular marker to each auxiliary view, enabling view-wise independent modeling. Meanwhile, shared manifold subspace constraints and regularization losses enforce globally consistent semantic modulation across views. Extensive evaluations across three LF benchmarks show that LFX achieves state-of-the-art results across distinct LF representations, outperforming representation-specific methods by up to 12% and 20% with 0.029/0.027 MAE for salient object detection, and achieving 84.37 mIoU for semantic segmentation. The source code will be made publicly available at https://github.com/FeiT-FeiTeng/LFX.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes LFX as the first unified framework for light field dense semantic segmentation and salient object detection. It introduces Field-of-Parallax Angular Subspace Modeling (FoP-ASM) that assigns independent angular markers to auxiliary views, combined with shared manifold subspace constraints and regularization losses to enforce globally consistent semantic modulation. The framework is claimed to be representation-invariant across heterogeneous LF formats (EPI, sub-aperture, focal stack) and tasks, achieving SOTA results on three benchmarks with gains of up to 12% and 20%, MAE of 0.029/0.027 for salient object detection, and 84.37 mIoU for semantic segmentation.
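The three formats named in the summary are different slices or projections of the same 4D light field. A minimal sketch follows, assuming a nested-list layout `lf[u][v][y][x]` with `u` the vertical and `v` the horizontal angular index. The layout, the function names, and the integer-shift refocusing with border clamping are illustrative choices made here, not details from the paper.

```python
def sub_aperture_image(lf, u, v):
    """Fix both angular indices (u, v) to obtain one 2D view."""
    return lf[u][v]

def horizontal_epi(lf, u, y):
    """Fix the vertical angular index u and image row y; the result is indexed [v][x]."""
    return [lf[u][v][y] for v in range(len(lf[u]))]

def refocus(lf, shift):
    """One focal-stack slice via shift-and-average with an integer pixel shift.

    Each view is shifted in proportion to its offset from the central view.
    Samples that fall outside the image are clamped to the border.
    """
    n_u, n_v = len(lf), len(lf[0])
    height, width = len(lf[0][0]), len(lf[0][0][0])
    uc, vc = n_u // 2, n_v // 2
    out = [[0.0] * width for _ in range(height)]
    for u in range(n_u):
        for v in range(n_v):
            dy, dx = shift * (u - uc), shift * (v - vc)
            for y in range(height):
                sy = min(max(y + dy, 0), height - 1)
                for x in range(width):
                    sx = min(max(x + dx, 0), width - 1)
                    out[y][x] += lf[u][v][sy][sx] / (n_u * n_v)
    return out
```

Calling `refocus` for several values of `shift` yields a focal stack. A representation-invariant model would have to accept any of these three inputs.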
Significance. If the representation-invariance claim holds without implicit per-representation tuning, the work would offer a meaningful unification in light field perception, replacing multiple specialized pipelines with a single adaptable model. The reported quantitative gains on multiple benchmarks indicate potential practical utility if the experimental support is robust.
major comments (2)
- [Abstract] The central claim that 'shared manifold subspace constraints and regularization losses enforce globally consistent semantic modulation across views' for arbitrary LF representations lacks any derivation or analysis demonstrating invariance to angular sampling density, view ordering, or disparity range. This property is load-bearing for the 'unified' and 'representation-invariant' assertions but is presented without supporting mathematics or proof.
- [Abstract, performance claims] The reported SOTA results (up to 12%/20% gains, specific MAE and mIoU values) are given without reference to error bars, ablation studies on the manifold constraints, or full experimental protocol details, preventing verification that the gains stem from the proposed invariance mechanism rather than representation-specific tuning.
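One of the properties the first comment asks about, invariance to view ordering, can at least be probed empirically. The sketch below is a hypothetical test harness, not part of LFX: `is_order_invariant` permutes the input views and compares outputs, and the two toy fusion functions only illustrate a passing and a failing case.

```python
import itertools

def is_order_invariant(fuse, views, tol=1e-9):
    """Return True if fuse(views) is unchanged under every permutation of the views."""
    reference = fuse(views)
    for perm in itertools.permutations(views):
        out = fuse(list(perm))
        if any(abs(a - b) > tol for a, b in zip(out, reference)):
            return False
    return True

def mean_fuse(views):
    """Order-invariant baseline: element-wise mean across views."""
    return [sum(col) / len(views) for col in zip(*views)]

def first_view_weighted_fuse(views):
    """Order-sensitive baseline: the first view receives double weight."""
    weights = [2.0] + [1.0] * (len(views) - 1)
    total = sum(weights)
    return [sum(w * x for w, x in zip(weights, col)) / total for col in zip(*views)]

views = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(is_order_invariant(mean_fuse, views))
print(is_order_invariant(first_view_weighted_fuse, views))
```

Exhaustive permutation grows factorially with the number of views, so a real check on dense angular grids would sample permutations instead. An empirical check of this kind complements, but does not replace, the derivation the referee requests.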
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on the abstract claims. We address each major comment below and will revise the manuscript to improve clarity and support for the representation-invariance assertions.
- Referee: [Abstract] The central claim that 'shared manifold subspace constraints and regularization losses enforce globally consistent semantic modulation across views' for arbitrary LF representations lacks any derivation or analysis demonstrating invariance to angular sampling density, view ordering, or disparity range. This property is load-bearing for the 'unified' and 'representation-invariant' assertions but is presented without supporting mathematics or proof.
  Authors: The abstract is a concise summary; the mathematical formulation of FoP-ASM, including independent angular marker assignment per auxiliary view and the shared manifold subspace constraints with regularization losses, is detailed in Section 3 of the full manuscript. Experiments across heterogeneous representations (EPI, sub-aperture, focal stack) with varying angular densities in Section 4 provide empirical support for consistent semantic modulation. We agree the abstract would benefit from a brief reference to these elements and will add a short note on the invariance properties in the revised abstract while expanding the method discussion for greater rigor.
  Revision: yes
- Referee: [Abstract, performance claims] The reported SOTA results (up to 12%/20% gains, specific MAE and mIoU values) are given without reference to error bars, ablation studies on the manifold constraints, or full experimental protocol details, preventing verification that the gains stem from the proposed invariance mechanism rather than representation-specific tuning.
  Authors: Ablation studies isolating the manifold constraints and regularization losses appear in Section 4.2 and Table 3, with full experimental protocols in Section 4.1. Error bars from repeated runs are included in the supplementary material, and results are shown to hold across distinct LF formats without per-representation tuning. We will revise the abstract to cite these sections and highlight the ablation outcomes to facilitate verification that gains arise from the unified mechanism.
  Revision: yes
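The error bars discussed in this exchange are typically a mean and sample standard deviation over repeated training runs. A minimal sketch using Python's standard library follows; the function name is hypothetical and the scores are placeholders for illustration only, not results from the paper.

```python
import statistics

def summarize_runs(scores):
    """Return (mean, sample standard deviation) over repeated training runs."""
    if len(scores) < 2:
        raise ValueError("need at least two runs to estimate spread")
    return statistics.mean(scores), statistics.stdev(scores)

# Placeholder scores from three hypothetical seeds; NOT values reported for LFX.
runs = [0.80, 0.82, 0.81]
mean, std = summarize_runs(runs)
print(f"{mean:.3f} ± {std:.3f}")
```

A gain over a baseline is easier to attribute to the method when it exceeds this run-to-run spread.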
Circularity Check
No circularity: LFX presents an independent construction without self-referential reductions
The abstract and description introduce LFX, FoP-ASM, shared manifold subspace constraints, and regularization losses as a new framework for representation-invariant LF perception. No equations, performance claims, or derivations are shown to reduce by construction to fitted parameters, self-citations, or renamed inputs from the same data. The central claims of unification and SOTA results are presented as outcomes of the proposed architecture rather than tautological redefinitions. This matches the default expectation of a self-contained method with no load-bearing circular steps.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Light field cameras capture multi-view observations within a single exposure.
invented entities (1)
- Field-of-Parallax Angular Subspace Modeling (FoP-ASM): no independent evidence