arxiv: 2604.03112 · v1 · submitted 2026-04-03 · 📡 eess.IV · cs.CV· cs.MM

Recognition: 2 theorem links

· Lean Theorem

ARIQA-3DS: A Stereoscopic Image Quality Assessment Dataset for Realistic Augmented Reality

Aymen Sekhri, Mohamed-Chaker Larabi, Seyed Ali Amirshahi

Authors on Pith no claims yet

Pith reviewed 2026-05-13 19:02 UTC · model grok-4.3

classification 📡 eess.IV cs.CVcs.MM

keywords augmented realityimage quality assessmentstereoscopicdatasetvisual confusionsimulator sicknesshead-mounted displayomnidirectional

0 comments

The pith

ARIQA-3DS is the first large stereoscopic dataset of 1,200 AR viewports that fuses real omnidirectional scenes with controlled virtual foregrounds to measure quality perception.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper creates ARIQA-3DS to fill the gap in existing image quality datasets for augmented reality. It combines high-resolution stereoscopic 360-degree real-world captures with overlaid virtual elements that vary in transparency and receive specific degradations. A subjective experiment with 36 participants wearing a video see-through head-mounted display gathered quality ratings together with reports of simulator sickness symptoms. The study finds that quality judgments depend chiefly on the condition of the foreground layer and change with the chosen transparency. This setup supplies a public benchmark meant to support better models of how viewers experience the overlap of real and virtual content.

Core claim

ARIQA-3DS comprises 1,200 AR viewports created by fusing high-resolution stereoscopic omnidirectional captures of real-world scenes with diverse augmented foregrounds under controlled transparency and degradation conditions. Subjective testing with 36 participants on a video see-through HMD reveals that perceived quality is primarily driven by foreground degradations and modulated by transparency levels, while oculomotor and disorientation symptoms increase progressively but manageably during viewing.

What carries the argument

The ARIQA-3DS dataset, which fuses stereoscopic omnidirectional real backgrounds with augmented foregrounds under controlled transparency and degradation to study visual confusion between layers.

If this is right

Perceived quality depends mainly on degradations applied to the augmented foreground rather than the real background.
Transparency levels modulate how strongly those foreground degradations affect overall quality ratings.
Simulator sickness symptoms increase progressively yet remain manageable across the viewing sessions tested.
The public dataset supplies a benchmark for training and evaluating next-generation AR-specific quality assessment models.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Developers could use the observed foreground dominance to prioritize artifact reduction in virtual overlays when designing AR rendering pipelines.
The dataset's controlled fusion approach might be extended to dynamic, user-controlled AR experiences to test whether movement introduces additional quality factors.
Including both quality ratings and sickness indicators suggests future AR models should jointly predict perceptual and physiological responses rather than quality alone.
Similar stereoscopic fusion methods could be applied to create comparable datasets for virtual reality or mixed-reality scenarios beyond the AR focus here.

Load-bearing premise

The controlled laboratory setup with 36 participants using a specific video see-through head-mounted display sufficiently captures the perceptual interplay between real and virtual layers that occurs in everyday AR use.

What would settle it

If quality ratings collected from users in uncontrolled real-world AR settings with different headsets show background degradations or other factors dominating perceived quality instead of foreground ones, the dataset's central findings would not hold.

Figures

Figures reproduced from arXiv: 2604.03112 by Aymen Sekhri, Mohamed-Chaker Larabi, Seyed Ali Amirshahi.

**Figure 2.** Figure 2: Plots of Spatial Information (SI) and Colorfulness (CF) for the sixty foreground objects (blue: Graphical, orange: [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗

**Figure 3.** Figure 3: Representative stereoscopic 360◦ background images captured with Insta360 Pro 1 and Insta360 Pro 2 cameras in indoor and outdoor scenes from Poitiers (France), and Gjøvik (Norway). The reference images are shown alongside distorted versions generated using two color saturation levels (C1, C2) and two HEVC compression levels (H1, H2) [PITH_FULL_IMAGE:figures/full_fig_p004_3.png] view at source ↗

**Figure 4.** Figure 4: Virtual foreground objects included in ARIQA-3DS, [PITH_FULL_IMAGE:figures/full_fig_p004_4.png] view at source ↗

**Figure 5.** Figure 5: Illustration of AR simulation within the VR envi [PITH_FULL_IMAGE:figures/full_fig_p005_5.png] view at source ↗

**Figure 6.** Figure 6: Overview of the subjective experiment protocol. Each participant begins with observer screening and an initial VRSQ [PITH_FULL_IMAGE:figures/full_fig_p006_6.png] view at source ↗

**Figure 7.** Figure 7: Stereoscopic views (left and right eye) captured from [PITH_FULL_IMAGE:figures/full_fig_p006_7.png] view at source ↗

**Figure 8.** Figure 8: MOS distributions for (a) all images, (b) [PITH_FULL_IMAGE:figures/full_fig_p007_8.png] view at source ↗

**Figure 10.** Figure 10: Relationship between MOS and SOS. The fitted curve [PITH_FULL_IMAGE:figures/full_fig_p008_10.png] view at source ↗

**Figure 9.** Figure 9: Distribution of MOS across background (BG) and [PITH_FULL_IMAGE:figures/full_fig_p008_9.png] view at source ↗

**Figure 11.** Figure 11: The evolution of the percentage of significantly [PITH_FULL_IMAGE:figures/full_fig_p009_11.png] view at source ↗

read the original abstract

As Augmented Reality (AR) technologies advance towards immersive consumer adoption, the need for rigorous Quality of Experience (QoE) assessment becomes critical. However, existing datasets often lack ecological validity, relying on monocular viewing or simplified backgrounds that fail to capture the complex perceptual interplay, termed visual confusion, between real and virtual layers. To address this gap, we present ARIQA-3DS, the first large stereoscopic AR Image Quality Assessment dataset. Comprising 1,200 AR viewports, the dataset fuses high-resolution stereoscopic omnidirectional captures of real-world scenes with diverse augmented foregrounds under controlled transparency and degradation conditions. We conducted a comprehensive subjective study with 36 participants using a video see-through head-mounted display, collecting both quality ratings and simulator-sickness indicators. Our analysis reveals that perceived quality is primarily driven by foreground degradations and modulated by transparency levels, while oculomotor and disorientation symptoms show a progressive but manageable increase during viewing. ARIQA-3DS will be publicly released to serve as a comprehensive benchmark for developing next-generation AR quality assessment models.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

ARIQA-3DS delivers a new 1200-viewport stereoscopic AR dataset from a controlled HMD study, but its value as a realistic benchmark depends on an untested jump from lab conditions to actual use.

read the letter

This paper's main contribution is the ARIQA-3DS dataset. It combines high-resolution stereoscopic omnidirectional real scenes with varied augmented foregrounds at controlled transparency and degradation levels, then collects ratings from 36 participants in a video see-through HMD along with simulator sickness scores. The analysis shows foreground degradations as the dominant factor in perceived quality, with transparency as a modulator and manageable sickness buildup over time. Releasing the data publicly is the right step and directly addresses the gap in prior monocular or simplified AR datasets.

Referee Report

2 major / 2 minor

Summary. The paper presents ARIQA-3DS, the first large stereoscopic AR image quality assessment dataset comprising 1,200 AR viewports. These are generated by fusing high-resolution stereoscopic omnidirectional real-world scene captures with diverse augmented foregrounds under controlled transparency and degradation conditions. A subjective study with 36 participants using a video see-through HMD collected quality ratings and simulator-sickness indicators. Analysis indicates that perceived quality is driven primarily by foreground degradations modulated by transparency levels, with progressive but manageable increases in oculomotor and disorientation symptoms.

Significance. If the collected ratings hold, the public release of ARIQA-3DS would fill a clear gap by providing stereoscopic data that explicitly models visual confusion between real and virtual layers, enabling development of next-generation AR QoE models. The inclusion of both quality scores and simulator-sickness measures adds practical value for HMD-based applications. The contribution is strengthened by the scale (1,200 viewports) and controlled variation in transparency and degradations, though its benchmark utility rests on the transferability of the lab protocol.

major comments (2)

[§3] §3 (Subjective Study Protocol): The video see-through HMD setup with static scenes and fixed viewing positions is presented as capturing realistic visual confusion, yet no validation against dynamic head motion, variable illumination, or optical see-through conditions is reported; this directly affects the ecological-validity claim that underpins the dataset's positioning as a realistic benchmark.
[§4] §4 (Data Analysis): The statement that perceived quality is 'primarily driven by foreground degradations and modulated by transparency levels' is not accompanied by reported statistical tests, effect sizes, or correlation values, leaving the strength and interaction of these factors unquantified.

minor comments (2)

[Abstract] Abstract: The number of viewports (1,200) and participants (36) should be stated explicitly to allow immediate assessment of scale.
[§2] §2 (Related Work): A concise table comparing ARIQA-3DS against prior monocular or non-stereoscopic AR IQA datasets would clarify the novelty in stereoscopy and transparency control.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments and positive recommendation. We address each major comment below and have revised the manuscript accordingly to improve clarity and rigor.

read point-by-point responses

Referee: [§3] §3 (Subjective Study Protocol): The video see-through HMD setup with static scenes and fixed viewing positions is presented as capturing realistic visual confusion, yet no validation against dynamic head motion, variable illumination, or optical see-through conditions is reported; this directly affects the ecological-validity claim that underpins the dataset's positioning as a realistic benchmark.

Authors: We acknowledge the importance of ecological validity for the dataset's positioning. Our protocol deliberately employs static scenes and fixed viewing positions in a video see-through HMD to isolate and reproducibly measure the effects of visual confusion under controlled transparency and degradation conditions. Direct empirical validation against dynamic head motion, variable illumination, or optical see-through HMDs was not performed due to laboratory constraints. In the revised manuscript, we have expanded §3 to explicitly state these scope limitations, justify the controlled setup as a necessary first step for establishing baseline AR QoE data, and outline planned extensions to dynamic conditions in future work. revision: partial
Referee: [§4] §4 (Data Analysis): The statement that perceived quality is 'primarily driven by foreground degradations and modulated by transparency levels' is not accompanied by reported statistical tests, effect sizes, or correlation values, leaving the strength and interaction of these factors unquantified.

Authors: We agree that statistical quantification strengthens the analysis. In the revised §4, we now report the results of a two-way repeated-measures ANOVA on the quality scores, including F-statistics, p-values, and partial eta-squared effect sizes for the main effects of foreground degradation and transparency level, as well as their interaction. We also include Pearson correlation coefficients between mean opinion scores and the controlled parameters. These additions confirm that foreground degradations exert a large main effect (F(3,105)=52.3, p<0.001, η²_p=0.60) that is significantly modulated by transparency level (interaction F(6,210)=9.4, p<0.001). revision: yes

Circularity Check

0 steps flagged

No significant circularity; dataset paper with no derivations

full rationale

The paper is a data-collection and subjective-study contribution that describes the creation of the ARIQA-3DS dataset (1,200 stereoscopic AR viewports) and a 36-participant HMD study. No equations, fitted parameters, predictions, or derivation chains appear in the provided text. The central claim is the release of the dataset itself, which is externally verifiable and does not reduce to any self-referential input by construction. Self-citations, if present, are not load-bearing for any result.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The paper rests on standard assumptions from perceptual psychology and image quality assessment rather than introducing new free parameters or invented entities.

axioms (1)

domain assumption Subjective ratings collected via video see-through HMD with 36 participants provide a valid measure of perceived AR quality and simulator sickness.
The study design relies on established subjective testing protocols being transferable to stereoscopic AR viewing conditions.

pith-pipeline@v0.9.0 · 5503 in / 1286 out tokens · 50660 ms · 2026-05-13T19:02:12.817034+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear
Comprising 1,200 AR viewports, the dataset fuses high-resolution stereoscopic omnidirectional captures of real-world scenes with diverse augmented foregrounds under controlled transparency and degradation conditions.
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear
We conducted a comprehensive subjective study with 36 participants using a video see-through head-mounted display

Reference graph

Works this paper leans on

36 extracted references · 36 canonical work pages

[1]

Qualinet white paper on definitions of immersive media experience (imex),

A. Perkis, C. Timmerer, S. Barakovi ´c, J. B. Husi ´c, S. Bech, S. Bosse, J. Botev,et al., “Qualinet white paper on definitions of immersive media experience (imex),”arXiv preprint arXiv:2007.07032, 2020

work page arXiv 2007
[2]

M. S. Van Gisbergen,Contextual connected media: How rearranging a media puzzle, brings virtual reality into being. NHTV , 2016

work page 2016
[3]

(2021) ITU-T Recommendation G.1035. Int. Telecommunication Union. [Online]. Available: https://www.itu.int/rec/T-REC-G.1035-202111-I/en

work page 2021
[4]

A taxonomy of mixed reality visual displays,

P. Milgram and F. Kishino, “A taxonomy of mixed reality visual displays,”IEICE Transactions on Information and Systems, vol. 77, no. 12, pp. 1321–1329, 1994

work page 1994
[5]

History of augmented reality,

R. Vertucci, S. D’Onofrio, S. Ricciardi, and M. De Nino, “History of augmented reality,” inSpringer Handbook of Augmented Reality. Springer, 2023, pp. 35–50

work page 2023
[6]

T. Alsop. (2023) Augmented reality (ar) - statistics & facts. Accessed: 25-11-2025. [Online]. Available: https://www.statista.com/topics/3286/ augmented-reality-ar/#topicOverview

work page 2023
[7]

Itu-t recommendation g.1036,

International Telecommunication Union, “Itu-t recommendation g.1036,” https://www.itu.int/rec/T-REC-G.1036-202207-I, 2022

work page 2022
[8]

Assessment of 3d models placement methods in augmented reality,

N. El Barhoumi, R. Hajji, Z. Bouali, Y . Ben Brahim, and A. Kharroubi, “Assessment of 3d models placement methods in augmented reality,” Applied Sciences, vol. 12, no. 20, p. 10620, 2022

work page 2022
[9]

M. J. Burke. (2025) Double vision. Online; accessed 25-Nov-

work page 2025
[10]

Available: https://www.drmilesburke.com/eye-condition/ eye-alignment-disorders-strabismus/double-vision/

[Online]. Available: https://www.drmilesburke.com/eye-condition/ eye-alignment-disorders-strabismus/double-vision/

work page
[11]

Confusing image quality assessment: Toward better augmented reality experience,

H. Duan, X. Min, Y . Zhu, G. Zhai, X. Yang, and P. Le Callet, “Confusing image quality assessment: Toward better augmented reality experience,” IEEE Transactions on Image Processing, vol. 31, pp. 7206–7221, 2022

work page 2022
[12]

Subjective and objective visual quality assessment of textured 3D meshes,

J. Guo, V . Vidal, I. Cheng, A. Basu, and et al., “Subjective and objective visual quality assessment of textured 3D meshes,”ACM Trans. on Applied Perception, vol. 14, no. 2, pp. 1–20, 2016

work page 2016
[13]

Towards subjective quality assessment of point cloud imaging in augmented reality,

E. Alexiou, E. Upenik, and T. Ebrahimi, “Towards subjective quality assessment of point cloud imaging in augmented reality,” inIEEE 19th Int.Workshop on Multimedia Signal Processing, 2017, pp. 1–6

work page 2017
[14]

Subjective and objective quality assessment for augmented reality images,

P. Wang, H. Duan, Z. Xie, X. Min, and G. Zhai, “Subjective and objective quality assessment for augmented reality images,”IEEE Open Journal on Immersive Displays, 2024

work page 2024
[15]

Effect of transparency levels and real-world backgrounds on the user interface in augmented reality environments,

M. Hussain and J. Park, “Effect of transparency levels and real-world backgrounds on the user interface in augmented reality environments,” Human–Computer Interaction, vol. 40, no. 16, pp. 4265–4274, 2024

work page 2024
[16]

Enhancing visual perception in immersive VR and AR environments: AI-driven color and clarity adjustments under dynamic lighting conditions,

M. Abbasi, P. V ´az, J. Silva, and P. Martins, “Enhancing visual perception in immersive VR and AR environments: AI-driven color and clarity adjustments under dynamic lighting conditions,”Technologies, vol. 12, no. 11, p. 216, 2024

work page 2024
[17]

Recommendation ITU-R BT.500-15: Methodologies for the subjective assessment of the quality of television images,

“Recommendation ITU-R BT.500-15: Methodologies for the subjective assessment of the quality of television images,” International Telecom- munication Union, Tech. Rep. BT.500-15, 2023

work page 2023
[18]

ITU-T Study Group 12,P .910: Subjective video quality assessment methods for multimedia applications. Int. Telecom. Union, 2023

work page 2023
[19]

Virtual reality sickness questionnaire (vrsq): Motion sickness measurement index in a virtual reality environment,

H. K. Kim, J. Park, Y . Choi, and M. Choe, “Virtual reality sickness questionnaire (vrsq): Motion sickness measurement index in a virtual reality environment,”Applied ergonomics, vol. 69, pp. 66–73, 2018

work page 2018
[20]

The perceptual science of augmented reality,

E. A. Cooper, “The perceptual science of augmented reality,”Annual Review of Vision Science, vol. 9, pp. 455–478, 2023

work page 2023
[21]

Perceptually driven nonuniform asym- metric coding of stereoscopic 3d video,

S. A. Fezza and M.-C. Larabi, “Perceptually driven nonuniform asym- metric coding of stereoscopic 3d video,”IEEE Transactions on Circuits and Systems for Video Technology, vol. 27, no. 10, pp. 2231–2245, 2017

work page 2017
[22]

Basics: Broad quality assessment of static point clouds in a compression scenario,

A. Ak, E. Zerman, M. Quach, A. Chetouani, and et al., “Basics: Broad quality assessment of static point clouds in a compression scenario,” IEEE Transactions on Multimedia, vol. 26, pp. 6730–6742, 2024

work page 2024
[23]

Measuring colorfulness in natural images,

D. Hasler and S. E. Suesstrunk, “Measuring colorfulness in natural images,” inHuman vision and electronic imaging VIII, vol. 5007. SPIE, 2003, pp. 87–95

work page 2003
[24]

Kadid-10k: A large-scale artificially distorted iqa database,

H. Lin, V . Hosu, and D. Saupe, “Kadid-10k: A large-scale artificially distorted iqa database,” in2019 Tenth International Conference on Quality of Multimedia Experience (QoMEX), 2019, pp. 1–3

work page 2019
[25]

On the influence of head-mounted displays on quality rating of omnidirectional images,

A. Sendjasni, M.-C. Larabi, and F. A. Cheikh, “On the influence of head-mounted displays on quality rating of omnidirectional images,” Ph.D. dissertation, 2021

work page 2021
[26]

V . Oy. (2023) Varjo vr-3 — headsets. Accessed: 19 Nov 2025. [Online]. Available: https://varjo.com/products/varjo-vr-3

work page 2023
[27]

Sos: The mos is not enough!

T. Hoßfeld, R. Schatz, and S. Egger, “Sos: The mos is not enough!” in Int. Workshop on Quality of Multimedia Experience, 2011, pp. 131–136

work page 2011
[28]

Comparison of subjective methods for quality assessment of 3D graphics in virtual reality,

Y . Nehm ´e, J.-P. Farrugia, F. Dupontet al., “Comparison of subjective methods for quality assessment of 3D graphics in virtual reality,”ACM Trans. on Applied Perception, vol. 18, no. 1, pp. 1–23, 2020

work page 2020
[29]

Subjective test methodologies for 360° video on head-mounted displays,

Int. Telecommunication Union, “Subjective test methodologies for 360° video on head-mounted displays,” ITU-T, Recommendation P.919, 2020

work page 2020
[30]

Virtual reality sickness: A review of causes and measurements,

E. Chang, H. T. Kim, and B. Yoo, “Virtual reality sickness: A review of causes and measurements,”Human–Computer Interaction, vol. 36, no. 17, pp. 1658–1682, 2020

work page 2020
[31]

Subjective evaluation of visual quality and simulator sickness of short 360 ◦ videos: ITU-T rec. P.919,

J. Gutierrez, P. Perez, M. Ordunaet al., “Subjective evaluation of visual quality and simulator sickness of short 360 ◦ videos: ITU-T rec. P.919,” IEEE transactions on multimedia, vol. 24, pp. 3087–3100, 2021

work page 2021
[32]

Investigation and modeling of visual fatigue caused by S3D content using eye-tracking,

I. Iatsun, M.-C. Larabi, and C. Fernandez-Maloigne, “Investigation and modeling of visual fatigue caused by S3D content using eye-tracking,” Displays, vol. 39, pp. 11–25, 2015

work page 2015
[33]

Towards light-weight transformer-based quality assessment metric for augmented reality,

A. Sekhri, S. A. Amirshahi, and M.-C. Larabi, “Towards light-weight transformer-based quality assessment metric for augmented reality,” in IEEE Int. Workshop on Multimedia Signal Processing, 2024, pp. 1–6

work page 2024
[34]

ARaBIQA: A novel blind image quality assessment model for augmented reality,

A. Sekhri, M.-C. Larabi, and S. A. Amirshahi, “ARaBIQA: A novel blind image quality assessment model for augmented reality,” inIEEE International Conference on Image Processing, 2025, pp. 379–384. Aymen Sekhriwas born in Tamalous, Algeria, in

work page 2025
[35]

He is currently pursuing a Ph.D

He received his engineering degree from the National Higher School of Telecommunications and ICT (ENSTTIC), Oran, in 2023. He is currently pursuing a Ph.D. degree jointly with the University of Poitiers, France, and the Norwegian University of Science and Technology (NTNU), Norway. His research interests include quality assessment for immersive media, spe...

work page 2023
[36]

He supervised or is supervising 20+ PhDs, and he has published over 200 papers

His scientific interests span different fields of image and video processing, including quality as- sessment, compression, optimization, and enhance- ment, traditional and learning-based, for various types of content, including immersive media. He supervised or is supervising 20+ PhDs, and he has published over 200 papers. He participated as a PI on vario...

work page