LFX: Towards Unified Light Field Dense Semantic Segmentation and Salient Object Detection

Boyuan Zheng; Buyin Deng; Fei Teng; Hong Zheng; Jiaming Zhang; Kailun Yang; Kai Luo; Kunyu Peng; Lingxin Huang; Yaonan Wang

arXiv: 2503.00747 · v2 · pith:MGBBBEX5new · submitted 2025-03-02 · 💻 cs.CV · cs.RO · eess.IV

Fei Teng, Lingxin Huang, Buyin Deng, Kai Luo, Boyuan Zheng, Zheng Fang, Hong Zheng, Kunyu Peng, Jiaming Zhang, Yaonan Wang, Kailun Yang

Pith reviewed 2026-05-23 01:13 UTC · model grok-4.3

classification 💻 cs.CV · cs.RO · eess.IV

keywords light field · semantic segmentation · salient object detection · unified framework · feature modulation · angular subspace

The pith

LFX creates a representation-invariant space that lets one model handle any light field format for both semantic segmentation and salient object detection.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces LFX as the first unified framework for light field perception tasks. It aims to overcome the limitation of previous methods that require tailoring to specific light field representations. By establishing a representation-invariant feature modulation space using Field-of-Parallax Angular Subspace Modeling and shared manifold constraints, LFX adapts to heterogeneous inputs and multiple tasks. A sympathetic reader would care because this could simplify development of light field applications by eliminating the need for multiple specialized models.

Core claim

LFX establishes a representation-invariant feature modulation space, enabling adaptation to heterogeneous LF representations and diverse perception tasks through Field-of-Parallax Angular Subspace Modeling that assigns independent angular markers to each auxiliary view, with shared manifold subspace constraints and regularization losses enforcing globally consistent semantic modulation across views.

What carries the argument

Field-of-Parallax Angular Subspace Modeling (FoP-ASM) combined with shared manifold subspace constraints, which allows view-wise independent modeling while maintaining global consistency.
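
As a rough illustration of that combination, the sketch below gives each auxiliary view its own marker vector while routing every view through one shared low-rank projection. It is not the authors' implementation: the names `angular_markers`, `W_down`, and `W_up`, the additive marker, the `tanh` nonlinearity, and the variance-based consistency penalty are all assumptions chosen for illustration, and the paper defines its own formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only to keep the example small.
num_views, channels, rank = 4, 16, 4
H, W = 8, 8

# One independent marker per auxiliary view (assumed to be learnable).
angular_markers = rng.normal(size=(num_views, rank))

# Shared subspace: the same projections serve every view (assumption).
W_down = rng.normal(size=(channels, rank)) / np.sqrt(channels)
W_up = rng.normal(size=(rank, channels)) / np.sqrt(rank)


def modulate(center_feat, view_feats):
    """Modulate centre-view features with marker-tagged auxiliary views.

    center_feat: (H, W, C); view_feats: (V, H, W, C).
    Returns the modulated centre features and the per-view offsets.
    """
    z = view_feats @ W_down                      # (V, H, W, r): shared projection
    z = z + angular_markers[:, None, None, :]    # add the view-specific marker
    offsets = np.tanh(z) @ W_up                  # (V, H, W, C): back to feature space
    return center_feat + offsets.mean(axis=0), offsets


def consistency_penalty(offsets):
    """Variance of the offsets across views, averaged over pixels and channels."""
    return float(offsets.var(axis=0).mean())


center = rng.normal(size=(H, W, channels))
views = rng.normal(size=(num_views, H, W, channels))
out, offsets = modulate(center, views)
print(out.shape, consistency_penalty(offsets))
```

In this toy version the markers are the only view-specific parameters, so the penalty measures how far the views drift apart after passing through the shared subspace. Whether the paper's regularization losses take this form is not stated in the abstract.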

If this is right

LFX achieves state-of-the-art performance on three LF benchmarks for both semantic segmentation and salient object detection.
On salient object detection it outperforms representation-specific methods by up to 12% and 20%, reaching MAE of 0.029/0.027.
It reaches 84.37 mIoU for semantic segmentation.
The framework works across distinct LF representations without specific tuning.
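
For readers less familiar with the two metrics quoted above, here is a minimal sketch of how MAE and mIoU are commonly computed for a single image. The function names are illustrative, and benchmark protocols usually aggregate over the whole test set (for mIoU, often through a dataset-level confusion matrix), so this per-image version is an assumption rather than the paper's evaluation code.

```python
import numpy as np


def mae(pred, gt):
    """Mean absolute error between a saliency map and its ground truth, both in [0, 1]."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    return float(np.abs(pred - gt).mean())


def mean_iou(pred, gt, num_classes):
    """Mean intersection-over-union over classes present in prediction or ground truth."""
    pred = np.asarray(pred)
    gt = np.asarray(gt)
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:  # skip classes absent from both maps
            ious.append(inter / union)
    if not ious:
        return float("nan")
    return float(np.mean(ious))


# Tiny hand-checkable example: class 0 has IoU 1/2, class 1 has IoU 2/3.
pred_labels = [[0, 1], [1, 1]]
gt_labels = [[0, 1], [0, 1]]
print(mean_iou(pred_labels, gt_labels, num_classes=2))
print(mae([0.5, 1.0], [0.0, 1.0]))
```

Lower MAE is better and higher mIoU is better. `mean_iou` returns a fraction, whereas segmentation benchmarks typically quote the value multiplied by 100.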

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The approach could generalize to additional light field tasks such as depth estimation or refocusing.
Adoption might reduce computational overhead in systems that currently maintain separate models for each representation.
Future work could test the framework on light fields from different camera hardware not included in the current benchmarks.

Load-bearing premise

Shared manifold subspace constraints and regularization losses will enforce globally consistent semantic modulation across views for arbitrary light field representations and tasks without requiring representation-specific tuning.

What would settle it

Applying LFX to a previously unseen light field representation or task and observing whether it maintains performance without additional tuning or retraining.

Figures

Figures reproduced from arXiv: 2503.00747 by Boyuan Zheng, Buyin Deng, Fei Teng, Hong Zheng, Jiaming Zhang, Kailun Yang, Kai Luo, Kunyu Peng, Lingxin Huang, Yaonan Wang, Zheng Fang.

**Figure 1.** Comparison between different LF representations. Field of Parallax (FoP) distills the common features from three aspects: …

**Figure 2.** Three different feature adaptation strategies. When us…

**Figure 3.** The overall framework of the proposed LFX model is illustrated, where “AD” denotes the angular adapter, “SS Head” represents …

**Figure 4.** The visualization of results for three tasks.

Original abstract

Light field cameras capture multi-view observations within a single exposure. However, existing studies are typically tailored to specific LF representations, leaving the field without a unified learning framework. To bridge this gap, we present LFX, the first unified framework for LF perception. LFX establishes a representation-invariant feature modulation space, enabling it to adapt to heterogeneous LF representations and diverse perception tasks. Specifically, we propose Field-of-Parallax Angular Subspace Modeling (FoP-ASM), which assigns an independent angular marker to each auxiliary view, enabling view-wise independent modeling. Meanwhile, shared manifold subspace constraints and regularization losses enforce globally consistent semantic modulation across views. Extensive evaluations across three LF benchmarks show that LFX achieves state-of-the-art results across distinct LF representations, outperforming representation-specific methods by up to 12% and 20% with 0.029/0.027 MAE for salient object detection, and achieving 84.37 mIoU for semantic segmentation. The source code will be made publicly available at https://github.com/FeiT-FeiTeng/LFX.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

LFX tries to unify light field segmentation and saliency with FoP-ASM but the invariance claim looks under-supported so far.

LFX claims to be the first single framework that handles multiple light field representations for both dense semantic segmentation and salient object detection. It introduces FoP-ASM to give each auxiliary view its own angular marker, and adds shared manifold subspace constraints plus regularization losses to keep semantic modulation consistent across views. The abstract reports clear gains over representation-specific baselines on three benchmarks: improvements of up to 12% and 20%, 0.029/0.027 MAE on saliency, and 84.37 mIoU on segmentation, with code promised for release. That addresses a real practical issue in a subfield where most work stays locked to one format, such as EPI or focal stacks. The approach is straightforward and the numbers are presented without obvious circularity.

The soft spot is the central invariance claim. Nothing in the abstract derives or tests whether the manifold constraints remain effective when angular sampling density, view ordering, or disparity range changes, or when switching between fundamentally different LF formats. The regularization could easily embed assumptions about a canonical grid, which would make the unified property depend on implicit tuning rather than follow from the method. Without ablations that isolate those components across varied representations, the performance edge might not generalize as stated. The summary also shows no error bars or full protocol details, so the quantitative support is still provisional.

This paper is aimed at light-field computer vision researchers who want to reduce duplicated effort across tasks. It is coherent enough, and the gap it tackles worthwhile enough, to deserve peer review, though the referee will need to press on the invariance evidence and experimental rigor.

Referee Report

2 major / 0 minor

Summary. The paper proposes LFX as the first unified framework for light field dense semantic segmentation and salient object detection. It introduces Field-of-Parallax Angular Subspace Modeling (FoP-ASM) that assigns independent angular markers to auxiliary views, combined with shared manifold subspace constraints and regularization losses to enforce globally consistent semantic modulation. The framework is claimed to be representation-invariant across heterogeneous LF formats (EPI, sub-aperture, focal stack) and tasks, achieving SOTA results on three benchmarks with gains of up to 12% and 20%, MAE of 0.029/0.027 for salient object detection, and 84.37 mIoU for semantic segmentation.
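
To make the three formats named in the summary concrete, the sketch below derives a sub-aperture image, an epipolar-plane image (EPI), and a focal stack from one 4D light field array. The `(A_v, A_h, H, W)` axis layout, the function names, and the integer shift-and-sum refocusing with wrap-around borders are simplifying assumptions for illustration; they do not describe the benchmarks' file formats or the paper's preprocessing.

```python
import numpy as np


def sub_aperture(lf, av, ah):
    """Single view at angular position (av, ah); lf has shape (A_v, A_h, H, W)."""
    return lf[av, ah]


def horizontal_epi(lf, av, y):
    """EPI: fix angular row av and image row y, giving shape (A_h, W)."""
    return lf[av, :, y, :]


def refocus(lf, slope):
    """Shift-and-sum refocusing with integer pixel shifts (wraps at the borders)."""
    a_v, a_h, h, w = lf.shape
    cv, ch = a_v // 2, a_h // 2
    acc = np.zeros((h, w), dtype=np.float64)
    for av in range(a_v):
        for ah in range(a_h):
            # Shift each view in proportion to its offset from the centre view.
            dy = int(round(slope * (av - cv)))
            dx = int(round(slope * (ah - ch)))
            acc += np.roll(lf[av, ah], shift=(dy, dx), axis=(0, 1))
    return acc / (a_v * a_h)


def focal_stack(lf, slopes):
    """Stack of refocused images, one per slope (one per focal plane)."""
    return np.stack([refocus(lf, s) for s in slopes])


lf = np.arange(3 * 3 * 4 * 5, dtype=np.float64).reshape(3, 3, 4, 5)
print(sub_aperture(lf, 1, 1).shape)         # (4, 5)
print(horizontal_epi(lf, 1, 2).shape)       # (3, 5)
print(focal_stack(lf, [-1, 0, 1]).shape)    # (3, 4, 5)
```

The three outputs differ in shape and in what each axis means, which is why methods have tended to be written for one format at a time.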

Significance. If the representation-invariance claim holds without implicit per-representation tuning, the work would offer a meaningful unification in light field perception, replacing multiple specialized pipelines with a single adaptable model. The reported quantitative gains on multiple benchmarks indicate potential practical utility if the experimental support is robust.

major comments (2)

[Abstract] The central claim that 'shared manifold subspace constraints and regularization losses enforce globally consistent semantic modulation across views' for arbitrary LF representations lacks any derivation or analysis demonstrating invariance to angular sampling density, view ordering, or disparity range. This property is load-bearing for the 'unified' and 'representation-invariant' assertions but is presented without supporting mathematics or proof.
[Abstract, performance claims] The reported SOTA results (up to 12%/20% gains, specific MAE and mIoU values) are given without reference to error bars, ablation studies on the manifold constraints, or full experimental protocol details, preventing verification that the gains stem from the proposed invariance mechanism rather than representation-specific tuning.
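
One of the invariances raised in the first comment, view ordering, can be probed empirically. The sketch below is a hypothetical test harness, not something taken from the paper: `model` stands for any callable that maps a centre view and a stack of auxiliary views to an output array, and the function reports the largest output change seen when the auxiliary views are reordered.

```python
import numpy as np


def view_order_sensitivity(model, center, views, num_trials=5, seed=0):
    """Largest output change observed when the auxiliary views are reordered.

    Tries the reversed order first, then `num_trials` random permutations.
    """
    rng = np.random.default_rng(seed)
    n = len(views)
    perms = [np.arange(n)[::-1]] + [rng.permutation(n) for _ in range(num_trials)]
    reference = np.asarray(model(center, views))
    worst = 0.0
    for perm in perms:
        shuffled = np.asarray(model(center, views[perm]))
        worst = max(worst, float(np.abs(shuffled - reference).max()))
    return worst


# Toy stand-ins for a real network (both are assumptions for illustration).
def order_invariant(center, views):
    return center + views.mean(axis=0)


def order_dependent(center, views):
    weights = np.arange(1, len(views) + 1, dtype=np.float64)
    return center + (views * weights[:, None, None]).sum(axis=0)


center = np.zeros((2, 2))
views = np.arange(3, dtype=np.float64)[:, None, None] * np.ones((3, 2, 2))
print(view_order_sensitivity(order_invariant, center, views))
print(view_order_sensitivity(order_dependent, center, views))
```

A result near zero indicates the model ignores view order on that input. It does not speak to sampling density or disparity range, which would need their own probes.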

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on the abstract claims. We address each major comment below and will revise the manuscript to improve clarity and support for the representation-invariance assertions.

Referee: [Abstract] The central claim that 'shared manifold subspace constraints and regularization losses enforce globally consistent semantic modulation across views' for arbitrary LF representations lacks any derivation or analysis demonstrating invariance to angular sampling density, view ordering, or disparity range. This property is load-bearing for the 'unified' and 'representation-invariant' assertions but is presented without supporting mathematics or proof.

Authors: The abstract is a concise summary; the mathematical formulation of FoP-ASM, including independent angular marker assignment per auxiliary view and the shared manifold subspace constraints with regularization losses, is detailed in Section 3 of the full manuscript. Experiments across heterogeneous representations (EPI, sub-aperture, focal stack) with varying angular densities in Section 4 provide empirical support for consistent semantic modulation. We agree the abstract would benefit from a brief reference to these elements and will add a short note on the invariance properties in the revised abstract while expanding the method discussion for greater rigor.

Revision: yes

Referee: [Abstract, performance claims] The reported SOTA results (up to 12%/20% gains, specific MAE and mIoU values) are given without reference to error bars, ablation studies on the manifold constraints, or full experimental protocol details, preventing verification that the gains stem from the proposed invariance mechanism rather than representation-specific tuning.

Authors: Ablation studies isolating the manifold constraints and regularization losses appear in Section 4.2 and Table 3, with full experimental protocols in Section 4.1. Error bars from repeated runs are included in the supplementary material, and results are shown to hold across distinct LF formats without per-representation tuning. We will revise the abstract to cite these sections and highlight the ablation outcomes to facilitate verification that gains arise from the unified mechanism.

Revision: yes
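
For reference, error bars from repeated runs are usually reported as a mean and a sample standard deviation over seeds. The sketch below shows that calculation. The function name is illustrative and the scores are placeholder values, not results from the paper or its supplementary material.

```python
import numpy as np


def summarize_runs(scores):
    """Mean and sample standard deviation (ddof=1) over repeated runs."""
    scores = np.asarray(scores, dtype=np.float64)
    if scores.size < 2:
        raise ValueError("need at least two runs to estimate spread")
    return float(scores.mean()), float(scores.std(ddof=1))


# Placeholder values for illustration only, NOT results from the paper.
placeholder_miou = [80.0, 82.0, 84.0]
mean, std = summarize_runs(placeholder_miou)
print(f"{mean:.2f} ± {std:.2f}")
```

Using `ddof=1` gives the unbiased sample estimate, which matters when only a handful of seeds are available.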

Circularity Check

0 steps flagged

No circularity: LFX presents an independent construction without self-referential reductions

full rationale

The abstract and description introduce LFX, FoP-ASM, shared manifold subspace constraints, and regularization losses as a new framework for representation-invariant LF perception. No equations, performance claims, or derivations are shown to reduce by construction to fitted parameters, self-citations, or renamed inputs from the same data. The central claims of unification and SOTA results are presented as outcomes of the proposed architecture rather than tautological redefinitions. This matches the default expectation of a self-contained method with no load-bearing circular steps.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

Based solely on the abstract, the central claim rests on standard assumptions of deep learning for dense prediction plus the new modeling components; no explicit free parameters or invented physical entities are described.

axioms (1)

Domain assumption: Light field cameras capture multi-view observations within a single exposure.
Opening statement of the abstract used as background for the problem setting.

invented entities (1)

Field-of-Parallax Angular Subspace Modeling (FoP-ASM) · no independent evidence
purpose: Assigns an independent angular marker to each auxiliary view for view-wise independent modeling while using shared manifold constraints.
Newly proposed technique introduced to enable the unified framework.
