LoViF 2026 Challenge on Human-oriented Semantic Image Quality Assessment: Methods and Results
Pith reviewed 2026-05-10 15:46 UTC · model grok-4.3
The pith
The LoViF 2026 challenge introduces the SeIQA dataset to benchmark how humans perceive semantic information loss in degraded images.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The challenge establishes a new benchmark for human-oriented semantic image quality assessment, built on the SeIQA dataset. Participating methods score the loss of semantic content as judged by human perception, and the six valid final submissions are reported to achieve state-of-the-art performance on the SeIQA dataset.
What carries the argument
The SeIQA dataset of degraded-reference image pairs that supplies training and evaluation targets for scoring semantic information retention from the human perspective.
If this is right
- Researchers can now compare semantic quality assessment algorithms against a shared, publicly described test set instead of ad-hoc metrics.
- Image compression and restoration pipelines can incorporate semantic scores to prioritize retention of meaning over pixel fidelity.
- Methods developed for this task provide initial baselines that future work in semantic coding can build upon or surpass.
- The split structure allows direct measurement of generalization from training degradations to unseen test cases.
Where Pith is reading between the lines
- Semantic quality scores from this benchmark could be used as a training signal for generative models to reduce hallucinations of meaning.
- The approach may transfer to video or 3D content, creating evaluation standards for semantic fidelity in dynamic scenes.
- Downstream applications such as medical imaging or surveillance could adopt the metric to ensure critical information survives processing.
Load-bearing premise
The specific degradations and reference pairings in the SeIQA dataset accurately reflect how humans judge the loss of semantic meaning in images.
What would settle it
A controlled experiment in which new human raters assign semantic quality scores to the test images that diverge substantially from the ground-truth references used in the challenge would undermine the dataset's validity as a human-oriented benchmark.
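One way such a divergence could be quantified is with a rank correlation between the challenge's ground-truth scores and the scores collected from new raters. The sketch below assumes Spearman correlation as the agreement measure; the abstract does not name the challenge's evaluation metrics, and the function names and score values here are hypothetical.

```python
from math import sqrt

def ranks(values):
    """Return 1-based ranks, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson linear correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical scores for five test images (illustrative values only,
# not taken from the SeIQA dataset).
ground_truth = [4.5, 3.0, 1.5, 2.0, 5.0]
new_raters = [4.0, 3.5, 1.0, 2.5, 4.8]
agreement = spearman(ground_truth, new_raters)
print(f"Spearman agreement: {agreement:.3f}")
```

A value near 1 would indicate that new raters order the images the same way as the ground truth; how low a value counts as "substantial divergence" is a judgment the paper would need to specify.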
Abstract
This paper reviews the LoViF 2026 Challenge on Human-oriented Semantic Image Quality Assessment. The challenge aims to open a new direction, namely how to evaluate the loss of semantic information from the human perspective, and to promote related directions such as semantic coding, processing, and semantic-oriented optimization. Unlike existing quality assessment datasets, we build a dataset for human-oriented semantic quality assessment, termed the SeIQA dataset. The dataset is divided into three parts for this competition: (i) training data: 510 pairs of degraded images and their corresponding ground-truth references; (ii) validation data: 80 pairs of degraded images and their corresponding ground-truth references; (iii) testing data: 160 pairs of degraded images and their corresponding ground-truth references. The primary objective of this challenge is to establish a new and powerful benchmark for human-oriented semantic image quality assessment. A total of 58 teams registered for the competition, and 6 teams submitted valid solutions and fact sheets for the final testing phase. These submissions achieved state-of-the-art (SOTA) performance on the SeIQA dataset.
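The split sizes quoted in the abstract can be put in proportion with a few lines of arithmetic. The snippet below is only a convenience sketch; the dictionary keys are labels chosen here, and the counts are the ones stated above.

```python
# Split sizes as stated in the abstract (pairs of degraded and reference images).
splits = {"train": 510, "validation": 80, "test": 160}

total = sum(splits.values())
fractions = {name: count / total for name, count in splits.items()}

# Print each split with its share of the whole dataset.
for name, count in splits.items():
    print(f"{name:>10}: {count:4d} pairs ({fractions[name]:.1%})")
print(f"{'total':>10}: {total:4d} pairs")
```

This gives 750 pairs in total, with roughly 68% used for training, 11% for validation, and 21% for testing.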
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript reviews the LoViF 2026 Challenge on Human-oriented Semantic Image Quality Assessment. It introduces the SeIQA dataset with splits of 510 training, 80 validation, and 160 testing pairs of degraded images and ground-truth references. The paper states that 58 teams registered, 6 submitted valid solutions, and these achieved SOTA performance, with the goal of establishing a new benchmark for evaluating semantic information loss from the human perspective to advance semantic coding and processing.
Significance. A well-validated benchmark focused on human-perceived semantic loss could meaningfully advance semantic-aware image processing, coding, and optimization research by shifting focus from traditional pixel-level IQA. The reported participation and SOTA submissions indicate community interest, but the absence of construction details and validation metrics currently limits the benchmark's demonstrated utility and reproducibility.
major comments (2)
- [Abstract and Dataset section] The manuscript claims the SeIQA dataset is 'human-oriented' and provides only the split sizes (510/80/160 pairs), but supplies no protocol for degradation generation, selection criteria, semantic annotations, or subjective human testing to confirm alignment with human perception of semantic information loss. This is load-bearing for the central claim that the challenge establishes a 'new and powerful benchmark' and that submissions achieved SOTA on it.
- [Results and Evaluation] The statement that the six teams 'achieved state-of-the-art (SOTA) performance' is made without reporting the specific evaluation metrics, numerical scores, baseline comparisons, or statistical validation. This prevents assessment of whether the outcomes support the benchmark's effectiveness.
minor comments (2)
- [Results] Add a table or summary listing the six teams' methods, key innovations, and final scores to improve clarity and allow readers to understand the submitted solutions.
- [Introduction] The manuscript would benefit from explicit comparison to existing IQA datasets and metrics (e.g., LIVE, TID2013) to better contextualize the novelty of the human-oriented semantic focus.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript summarizing the LoViF 2026 Challenge. We address each major comment below and will revise the paper to improve detail and reproducibility.
- Referee: [Abstract and Dataset section] The manuscript claims the SeIQA dataset is 'human-oriented' and provides only the split sizes (510/80/160 pairs), but supplies no protocol for degradation generation, selection criteria, semantic annotations, or subjective human testing to confirm alignment with human perception of semantic information loss. This is load-bearing for the central claim that the challenge establishes a 'new and powerful benchmark' and that submissions achieved SOTA on it.
  Authors: We agree that the current high-level description in the manuscript does not sufficiently detail the SeIQA dataset construction to fully substantiate its human-oriented design. In the revised manuscript, we will expand the Dataset section with a description of the degradation generation protocol, image selection criteria, semantic annotations employed, and a summary of any subjective human testing used to align with perceived semantic loss. Complete protocols and raw subjective data will be referenced via the challenge repository to support reproducibility while respecting manuscript length constraints.
  Revision: yes
- Referee: [Results and Evaluation] The statement that the six teams 'achieved state-of-the-art (SOTA) performance' is made without reporting the specific evaluation metrics, numerical scores, baseline comparisons, or statistical validation. This prevents assessment of whether the outcomes support the benchmark's effectiveness.
  Authors: We concur that concrete metrics and scores are required to validate the SOTA claim and benchmark utility. The revised manuscript will include a dedicated Results section or table reporting the evaluation metrics (e.g., semantic similarity or human-aligned IQA scores), numerical performance values for the six valid submissions, comparisons to relevant baselines, and any statistical validation performed. This addition will enable readers to assess the outcomes directly.
  Revision: yes
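As an illustration of what the requested statistical validation might look like, the sketch below computes a percentile bootstrap confidence interval for the correlation between predicted and ground-truth scores. This is an assumed procedure, not one described by the authors: the choice of Pearson correlation, the function names, and the synthetic data are all hypothetical.

```python
import random
from math import sqrt

def pearson(x, y):
    """Pearson linear correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def bootstrap_ci(pred, target, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the pred/target correlation."""
    rng = random.Random(seed)
    n = len(pred)
    stats = []
    for _ in range(n_resamples):
        # Resample image indices with replacement, keeping pairs aligned.
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [pred[i] for i in idx]
        ys = [target[i] for i in idx]
        # Skip degenerate resamples where either side has zero variance.
        if len(set(xs)) < 2 or len(set(ys)) < 2:
            continue
        stats.append(pearson(xs, ys))
    stats.sort()
    low = stats[int((alpha / 2) * len(stats))]
    high = stats[min(len(stats) - 1, int((1 - alpha / 2) * len(stats)))]
    return low, high

# Synthetic stand-in data (not challenge results): noisy predictions
# of a linearly increasing ground-truth score.
data_rng = random.Random(1)
target = [i / 10 for i in range(40)]
pred = [t + data_rng.gauss(0, 0.5) for t in target]

low, high = bootstrap_ci(pred, target)
print(f"95% bootstrap interval for correlation: [{low:.3f}, {high:.3f}]")
```

Reporting an interval of this kind alongside each team's score would let readers judge whether differences between submissions and baselines exceed sampling noise on a 160-pair test set.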
Circularity Check
No circularity: challenge report contains no derivations or self-referential predictions
The paper is a competition summary describing the SeIQA dataset splits (510/80/160 pairs) and reporting external team submissions. No equations, fitted parameters, predictions, or derivation chains exist that could reduce to inputs by construction. The benchmark claim depends on dataset design but is not self-definitional, fitted, or justified via load-bearing self-citation within the text. This is a standard non-circular reporting paper.
Axiom & Free-Parameter Ledger
invented entities (1)
- SeIQA dataset: no independent evidence