pith. machine review for the scientific record

arxiv: 2605.05155 · v1 · submitted 2026-05-06 · 💻 cs.CV · cs.AI

Recognition: unknown

Aes3D: Aesthetic Assessment in 3D Gaussian Splatting

Authors on Pith: no claims yet

Pith reviewed 2026-05-08 16:49 UTC · model grok-4.3

classification 💻 cs.CV cs.AI
keywords 3D Gaussian Splatting · Aesthetic Assessment · 3D Scene Evaluation · Neural Rendering · Aesthetic Dataset · Lightweight Model · Immersive Media

The pith

A lightweight model predicts aesthetic scores for 3D scenes directly from Gaussian splat primitives without rendering images.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper addresses the gap in evaluating 3D neural rendering scenes for qualities like composition, harmony, and visual appeal rather than just reconstruction accuracy. It introduces the Aes3D framework, which includes a new dataset called Aesthetic3D annotated specifically for 3D scene aesthetics and a model called Aes3DGSNet that learns to regress scene-level scores from multi-view 3D Gaussian representations. The approach uses aesthetics-supervised learning to capture high-level cues from low-level primitives, and experiments show it delivers strong performance in a compact form. If correct, this would let creators assess and refine 3D content for appeal during development while cutting the need for full image rendering pipelines. It also sets an initial benchmark for systematic 3D aesthetic assessment in immersive media.

Core claim

We propose Aes3D, the first systematic framework for assessing the aesthetics of 3D neural rendering scenes. Aes3D includes Aesthetic3D, the first dataset dedicated to 3D scene aesthetic assessment, built on our proposed annotation strategy for 3D scene aesthetics. In addition, we present Aes3DGSNet, a lightweight model that directly predicts scene-level aesthetic scores from 3DGS representations. Notably, our model operates solely on 3D Gaussian primitives, eliminating the need for rendering multi-view images and thus reducing computational cost and hardware requirements. Through aesthetics-supervised learning on multi-view 3DGS scene representations, Aes3DGSNet effectively captures high-level aesthetic cues and accurately regresses aesthetic scores.

What carries the argument

Aes3DGSNet, a lightweight network that takes 3D Gaussian primitives as input and regresses scene-level aesthetic scores via aesthetics-supervised learning on multi-view representations.
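
To make the data flow concrete, here is a minimal sketch of what a permutation-invariant regressor over raw Gaussian attributes could look like. This is not the paper's Aes3DGSNet (the available text does not specify the architecture); the layer widths, the mean/max pooling, and the 59-dimensional per-primitive feature layout are illustrative assumptions.

```python
# Minimal sketch (PyTorch) of scoring a scene from its unordered Gaussian primitives.
# Not the paper's Aes3DGSNet: layer widths, the pooling choice, and the 59-d
# per-primitive feature layout are assumptions for illustration only.
import torch
import torch.nn as nn

class SetAestheticRegressor(nn.Module):
    def __init__(self, in_dim: int = 59, hidden: int = 128):
        super().__init__()
        # Shared encoder applied independently to every Gaussian primitive.
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # Scene-level head on top of an order-invariant pooled representation.
        self.head = nn.Sequential(
            nn.Linear(2 * hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, gaussians: torch.Tensor) -> torch.Tensor:
        # gaussians: (B, N, in_dim) -- N primitives per scene, in any order.
        h = self.encoder(gaussians)                                  # (B, N, hidden)
        pooled = torch.cat([h.mean(dim=1), h.amax(dim=1)], dim=-1)  # (B, 2*hidden)
        return self.head(pooled).squeeze(-1)                        # (B,) scene scores

# Example: 2 scenes, 4096 Gaussians each, 59 raw attributes per primitive.
scores = SetAestheticRegressor()(torch.randn(2, 4096, 59))
```

The point of the pooling step is that the prediction cannot depend on primitive order or on any particular rendered view, mirroring the claim that no image rendering is needed at inference.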

If this is right

  • Creators of 3D content can obtain aesthetic feedback without rendering full multi-view images, lowering compute and hardware demands.
  • The Aesthetic3D dataset provides a public resource for training and evaluating future 3D aesthetic assessment methods.
  • Aes3DGSNet establishes a new performance benchmark for lightweight, direct-from-primitives aesthetic scoring in 3DGS scenes.
  • The method supports iterative refinement of visually compelling 3D scenes in immersive media and digital content pipelines.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same primitive-to-score mapping could be adapted to other explicit 3D representations such as point clouds or meshes if the input encoding is adjusted.
  • Automated aesthetic optimization loops could be built on top of the scorer to adjust Gaussian parameters toward higher predicted appeal (a hypothetical sketch follows this list).
  • Extending the annotation strategy to dynamic or animated 3DGS scenes would test whether the learned cues generalize beyond static views.
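
As a purely hypothetical illustration of the second bullet, an optimization loop could take gradient steps on the Gaussian parameters to ascend the predicted score. It assumes a differentiable scorer such as the one sketched earlier; nothing of this sort appears in the paper.

```python
# Hypothetical sketch only: nudge a scene's Gaussian parameters toward a higher
# predicted aesthetic score. Assumes a differentiable scorer such as the
# SetAestheticRegressor sketched above; this loop is not from the paper.
import torch

def refine_scene(gaussians: torch.Tensor, scorer, steps: int = 100, lr: float = 1e-3):
    # gaussians: (N, D) raw primitive attributes for a single scene.
    params = gaussians.clone().requires_grad_(True)
    opt = torch.optim.Adam([params], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        score = scorer(params.unsqueeze(0)).squeeze(0)  # predicted scene score
        (-score).backward()                             # gradient ascent on the score
        opt.step()
    return params.detach()
```

In practice such a loop would also need constraints (e.g., keeping rotation quaternions normalized and opacities in range), which are omitted here for brevity.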

Load-bearing premise

High-level aesthetic attributes such as composition and harmony can be accurately regressed from the raw low-level attributes of 3D Gaussian primitives alone.
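
For reference, the "raw low-level attributes" in question are the standard 3DGS primitive parameters from Kerbl et al. (ref [25]); the flattening below into a single feature vector is an illustrative choice, not the paper's encoding.

```python
# The standard per-primitive 3DGS attributes the premise refers to (Kerbl et al.,
# ref [25]). Flattening them into one feature vector is an illustrative choice.
from dataclasses import dataclass
import numpy as np

@dataclass
class Gaussian3D:
    mean: np.ndarray       # (3,)  position in world space
    scale: np.ndarray      # (3,)  per-axis extent
    rotation: np.ndarray   # (4,)  unit quaternion
    opacity: float         #       scalar in [0, 1]
    sh_coeffs: np.ndarray  # (48,) RGB spherical-harmonic color coefficients (degree 3)

    def as_feature(self) -> np.ndarray:
        # One flat low-level feature vector per primitive (3 + 3 + 4 + 1 + 48 = 59 dims).
        return np.concatenate([self.mean, self.scale, self.rotation,
                               [self.opacity], self.sh_coeffs])
```

None of these fields encodes framing, composition, or harmony directly; the premise is that such cues are nonetheless recoverable from their joint statistics across a scene.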

What would settle it

The question would be settled against the claim if human raters on a held-out set of 3DGS scenes produced scores uncorrelated with the model's predictions, or if the model merely matched or underperformed a simple baseline that ignores the 3D structure.
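
A hedged sketch of that test, under the assumption that agreement is measured with Spearman rank correlation against a structure-blind baseline (the exact protocol is not specified in the available text):

```python
# Sketch of the settling test: the claim survives only if predictions correlate
# with held-out human scores and beat a baseline that ignores 3D structure.
# The significance threshold and baseline choice are illustrative assumptions.
import numpy as np
from scipy.stats import spearmanr

def claim_survives(human: np.ndarray, model: np.ndarray, baseline: np.ndarray) -> bool:
    rho_model, p_model = spearmanr(human, model)
    rho_base, _ = spearmanr(human, baseline)
    return (p_model < 0.05) and (rho_model > 0) and (rho_model > rho_base)

# Toy usage with synthetic numbers (not real data):
rng = np.random.default_rng(0)
human = rng.uniform(1, 10, size=50)
model = human + rng.normal(0, 1.0, size=50)    # correlated predictions
baseline = rng.uniform(1, 10, size=50)         # structure-blind guess
print(claim_survives(human, model, baseline))  # prints True or False
```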

Figures

Figures reproduced from arXiv: 2605.05155 by Boyu Wei, Chuanzhi Xu, Haodong Chen, Haoxian Zhou, Qiang Qu, Weidong Cai, Xuanhua Yin, Zihan Deng.

Figure 1: Aes3D includes a method for IAA-based aesthetic annotation of 3D scene datasets, upon which the Aesthetic3D dataset is constructed. It also includes Aes3DGSNet, a model capable of evaluating the aesthetic scores of 3DGS scenes. Some scoring examples are shown below.
Figure 2: Overview of IAA-based annotation for constructing Aesthetic3D.
Figure 3: Statistical overview of Aesthetic3D (8-attr mean).
Figure 4: Overview of Aes3DGSNet.
Figure 5: Score distributions of different IAA annotators on the Aesthetic3D dataset.
Figure 6: Distribution of scene-level total scores (left) and within-scene total-score gaps (right).
Figure 7: Distribution of ArtiMuse attribute scores across datasets. Box plots of the eight attribute-level scores for DL3DV-10K and Bilarf. Bilarf exhibits consistently higher score ranges across all attributes, indicating a clear distribution shift toward higher aesthetic quality.
Figure 8: Pairwise Pearson correlations among the eight aesthetic attributes. Left: DL3DV-10K shows consistently high correlations (0.86–0.99), indicating strong collinearity among attributes. Right: Bilarf exhibits similar trends, but originality shows weaker correlations with other attributes, suggesting partial independence under certain conditions.
Figure 9: Correlation between attribute-level scores and the holistic aesthetic score. On DL3DV-10K, all attributes show very strong correlations with the total score (Pearson ≈ 0.94–0.98). On Bilarf, ranking consistency remains high (Spearman ≈ 1.0), while linear correlations vary more, especially for originality, indicating strong ordinal consistency but dataset-dependent linear relationships.
Figure 10: Visualizations of data annotation examples (1).
Figure 11: Visualizations of data annotation examples (2).
Figure 12: Screenshots of the custom rating interface used in the human study.
Figure 13: Alignment between Human Study ratings and proxy aesthetic scores. Left: comparison of the marginal distributions of Human Study ratings and the proxy score. Right: paired agreement between integer score grades and integer Human Study ratings, where bubble size indicates the number of participant-scene ratings at each pair. Most ratings concentrate near the diagonal in the mid-score range.
Figure 14: Visualization examples of using Aes3DGSNet for aesthetic assessment of 3DGS scenes.
Original abstract

As 3D Gaussian Splatting (3DGS) gains attention in immersive media and digital content creation, assessing the aesthetics of 3D scenes becomes important in helping creators build more visually compelling 3D content. However, existing evaluation methods for 3D scenes primarily emphasize reconstruction fidelity and perceptual realism, largely overlooking higher-level aesthetic attributes such as composition, harmony, and visual appeal. This limitation comes from two key challenges: (1) the absence of general 3DGS datasets with aesthetic annotations, and (2) the intrinsic nature of 3DGS as a low-level primitive representation, which makes it difficult to capture high-level aesthetic features. To address these challenges, we propose Aes3D, the first systematic framework for assessing the aesthetics of 3D neural rendering scenes. Aes3D includes Aesthetic3D, the first dataset dedicated to 3D scene aesthetic assessment, built on our proposed annotation strategy for 3D scene aesthetics. In addition, we present Aes3DGSNet, a lightweight model that directly predicts scene-level aesthetic scores from 3DGS representations. Notably, our model operates solely on 3D Gaussian primitives, eliminating the need for rendering multi-view images and thus reducing computational cost and hardware requirements. Through aesthetics-supervised learning on multi-view 3DGS scene representations, Aes3DGSNet effectively captures high-level aesthetic cues and accurately regresses aesthetic scores. Experimental results demonstrate that our approach achieves strong performance while maintaining a lightweight design, establishing a new benchmark for 3D scene aesthetic assessment. Code and datasets will be made available in a future version.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes Aes3D as the first systematic framework for assessing aesthetics of 3D neural rendering scenes represented via 3D Gaussian Splatting. It introduces the Aesthetic3D dataset (the first dedicated to 3D scene aesthetic assessment, built via a proposed annotation strategy) and Aes3DGSNet, a lightweight model that directly regresses scene-level aesthetic scores (for attributes such as composition, harmony, and visual appeal) from 3DGS primitives alone, without rendering multi-view images. The model is trained via aesthetics-supervised learning on multi-view 3DGS representations, and the abstract claims that experiments demonstrate strong performance while maintaining a lightweight design, establishing a new benchmark.

Significance. If the central claims hold, the work would be significant for the field of neural rendering and immersive media by shifting evaluation focus from reconstruction fidelity to higher-level aesthetic attributes. The creation of a dedicated dataset and an efficient model that avoids rendering costs could enable practical tools for content creators. The use of multi-view training signals to learn from unordered 3D primitives is a potentially useful direction, though its soundness depends on validation details not supplied in the available text.

major comments (2)
  1. [Abstract] Abstract (final paragraph): The load-bearing claim that Aes3DGSNet 'directly predicts scene-level aesthetic scores from 3DGS representations' and 'operates solely on 3D Gaussian primitives, eliminating the need for rendering multi-view images' lacks any description of the architecture, aggregation mechanism over the unordered primitive set, or how view-dependent cues (occlusion, framing, lighting) are recovered. This is a correctness risk because aesthetic attributes are defined on 2D projections, and global statistics over primitives may not suffice even with multi-view training supervision.
  2. [Abstract] Abstract (experimental results sentence): No quantitative metrics, dataset statistics (scene count, annotation protocol, inter-annotator agreement), model size, baselines, train/test splits, or error bars are reported, making it impossible to evaluate the 'strong performance' or 'lightweight design' assertions or to determine whether the data supports the new-benchmark claim.
minor comments (1)
  1. [Abstract] Abstract: The promise that 'Code and datasets will be made available in a future version' is positive, but the manuscript should supply at least high-level dataset statistics and annotation guidelines to allow readers to assess the annotation strategy's reliability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading and constructive comments on the abstract. We respond to each major comment below and indicate planned revisions.

Point-by-point responses
  1. Referee: [Abstract] Abstract (final paragraph): The load-bearing claim that Aes3DGSNet 'directly predicts scene-level aesthetic scores from 3DGS representations' and 'operates solely on 3D Gaussian primitives, eliminating the need for rendering multi-view images' lacks any description of the architecture, aggregation mechanism over the unordered primitive set, or how view-dependent cues (occlusion, framing, lighting) are recovered. This is a correctness risk because aesthetic attributes are defined on 2D projections, and global statistics over primitives may not suffice even with multi-view training supervision.

    Authors: We agree the abstract is highly condensed and omits these details. The full manuscript (Section 3) specifies that Aes3DGSNet uses a lightweight set-based aggregator (permutation-invariant operations over the 3D Gaussian attributes) trained with multi-view aesthetic supervision; view-dependent effects are learned implicitly through the supervision signal rather than explicit rendering at inference. We will revise the abstract to include a short clause describing the aggregation mechanism and the multi-view training strategy; a hedged sketch of such a training step follows these responses. revision: yes

  2. Referee: [Abstract] Abstract (experimental results sentence): No quantitative metrics, dataset statistics (scene count, annotation protocol, inter-annotator agreement), model size, baselines, train/test splits, or error bars are reported, making it impossible to evaluate the 'strong performance' or 'lightweight design' assertions or to determine whether the data supports the new-benchmark claim.

    Authors: The abstract follows conventional length constraints by summarizing results at a high level. All requested quantitative information (dataset size and annotation protocol, inter-annotator agreement, model parameter count, baseline comparisons, cross-validation splits, and error bars) appears in the Experiments section. We will revise the abstract to incorporate one or two key quantitative highlights (e.g., correlation with human ratings and parameter count) while respecting word limits. revision: yes
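
A hedged sketch of what the training step described in response 1 could look like: multi-view proxy scores collected at annotation time are reduced to a scene-level target, which the primitive-only model regresses. The mean reduction, the Huber loss (cf. ref [19]), and the optimizer are assumptions, not the paper's recipe.

```python
# Hedged sketch of a "multi-view aesthetic supervision" training step: per-view
# IAA scores from annotation are reduced to a scene-level target that the
# primitive-only model regresses. Reduction, loss, and optimizer are assumptions.
import torch
import torch.nn.functional as F

def train_step(model, optimizer, gaussians, view_scores):
    # gaussians:   (B, N, D) raw primitives per scene (no rendering in this step)
    # view_scores: (B, V) per-view aesthetic scores from the IAA annotator
    target = view_scores.mean(dim=1)       # scene-level supervision signal
    pred = model(gaussians)                # (B,) predicted aesthetic scores
    loss = F.huber_loss(pred, target)      # robust regression loss (cf. ref [19])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```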

Circularity Check

0 steps flagged

No significant circularity; new dataset and model are independent contributions

Full rationale

The paper introduces a new dataset (Aesthetic3D) with a proposed annotation strategy and a new lightweight network (Aes3DGSNet) trained via supervised learning on multi-view 3DGS data to regress scene-level aesthetic scores directly from Gaussian primitives. No mathematical derivation chain, fitted parameters renamed as predictions, or self-citation load-bearing steps are present in the provided text. The central claims rest on empirical construction of the dataset and architecture rather than any reduction of outputs to inputs by definition or prior self-referential results. The approach is self-contained against external benchmarks of dataset creation and model performance.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract does not specify any free parameters, background axioms, or newly postulated entities. The contributions consist of a new annotated dataset and a neural network model for aesthetic prediction.

pith-pipeline@v0.9.0 · 5617 in / 1423 out tokens · 90771 ms · 2026-05-08T16:49:07.194983+00:00 · methodology


Reference graph

Works this paper leans on

67 extracted references · 12 canonical work pages

  1. [1] Abbas Anwar, Saira Kanwal, Muhammad Tahir, Muhammad Saqib, Muhammad Uzair, Mohammad Khalid Imam Rahmani, and Habib Ullah. A survey on image aesthetic assessment. arXiv preprint arXiv:2103.11616, 2021.
  2. [2] Fatemeh Behrad, Tinne Tuytelaars, and Johan Wagemans. Charm: the missing piece in vit fine-tuning for image aesthetic assessment. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 7815–7824, 2025.
  3. [3] Shyamal Buch, Arsha Nagrani, Anurag Arnab, and Cordelia Schmid. Flexible frame selection for efficient video reasoning. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 29071–29082, 2025.
  4. [4] Shuo Cao, Ning Ma, Jiayang Li, Xiaolong Li, Ling Shao, Kaiwen Zhu, Yu Zhou, Yanfeng Wang, et al. Artimuse: Fine-grained image aesthetics assessment with joint scoring and expert-level understanding. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026.
  5. [5] Luigi Celona, Marco Leonardi, Paolo Napoletano, and Alessandro Rozza. Composition and style attributes guided image aesthetic assessment. IEEE Transactions on Image Processing, 31:5009–5024, 2022.
  6. [6] Guikun Chen and Wenguan Wang. A survey on 3d gaussian splatting. ACM Computing Surveys.
  7. [7] ISSN 0360-0300. doi: 10.1145/3807511. URL https://doi.org/10.1145/3807511.
  8. [8] Tianang Chen, Jian Jin, Shilv Cai, Zhuangzi Li, and Weisi Lin. Mugsqa: Novel multi-uncertainty-based gaussian splatting quality assessment method, dataset, and benchmarks. In ICASSP 2026-2026 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 11737–11741. IEEE, 2026.
  9. [9] Zhimin Chen, Xuewei Chen, Xiao Guo, Yingwei Li, Longlong Jing, Liang Yang, and Bing Li. Point cloud self-supervised learning via 3d to multi-view masked learner. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 27618–27629, 2025.
  10. [10] Maedeh Daryanavard Chounchenani, Asadollah Shahbahrami, Reza Hassanpour, and Georgi Gaydadjiev. Deep learning based image aesthetic quality assessment - a review. ACM Computing Surveys, 57(7), February 2025. ISSN 0360-0300. doi: 10.1145/3716820. URL https://doi.org/10.1145/3716820.
  11. [11] Ritendra Datta, Dhiraj Joshi, Jia Li, and James Z Wang. Studying aesthetics in photographic images using a computational approach. In European conference on computer vision, pages 288–301. Springer, 2006.
  12. [12] Yubin Deng, Chen Change Loy, and Xiaoou Tang. Image aesthetic assessment: An experimental survey. IEEE Signal Processing Magazine, 34(4):80–106, 2017. doi: 10.1109/MSP.2017.2696576.
  13. [13] Yubin Deng, Chen Change Loy, and Xiaoou Tang. Aesthetic-driven image enhancement by adversarial learning. In Proceedings of the 26th ACM international conference on Multimedia, pages 870–878, 2018.
  14. [14] Zhiyuan Fang, Rengan Xie, Xuancheng Jin, Qi Ye, Wei Chen, Wenting Zheng, Rui Wang, and Yuchi Huo. A3gs: Arbitrary artistic style into arbitrary 3d gaussian splatting. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 17751–17760, October 2025.
  15. [15] Ben Fei, Jingyi Xu, Rui Zhang, Qingyuan Zhou, Weidong Yang, and Ying He. 3d gaussian splatting as new era: A survey. IEEE Transactions on Visualization and Computer Graphics, 2024.
  16. [16] Shuai He, Yongchang Zhang, Rui Xie, Dongxiang Jiang, and Anlong Ming. Rethinking image aesthetics assessment: Models, datasets and benchmarks. In Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI 2022, pages 942–948, 2022. doi: 10.24963/ijcai.2022/132. URL https://doi.org/10.24963/ijcai.2022/132.
  17. [17] Vlad Hosu, Bastian Goldlucke, and Dietmar Saupe. Effective aesthetics prediction with multi-level spatially pooled features. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 9375–9383, 2019.
  18. [18] Yipo Huang, Xiangfei Sheng, Zhichao Yang, Quan Yuan, Zhichao Duan, Pengfei Chen, Leida Li, Weisi Lin, and Guangming Shi. Aesexpert: Towards multi-modality foundation model for image aesthetics perception. In Proceedings of the 32nd ACM International Conference on Multimedia, pages 5911–5920, 2024.
  19. [19] Peter J Huber. Robust estimation of a location parameter. In Breakthroughs in statistics: Methodology and distribution, pages 492–518. Springer, 1992.
  20. [20] Xin Jin, Le Wu, Geng Zhao, Xiaodong Li, Xiaokun Zhang, Shiming Ge, Dongqing Zou, Bin Zhou, and Xinghui Zhou. Aesthetic attributes assessment of images. In Proceedings of the 27th ACM international conference on multimedia, pages 311–319, 2019.
  21. [21] Thorsten Joachims. Optimizing search engines using clickthrough data. In Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 133–142, 2002.
  22. [22] Junjie Ke, Qifei Wang, Yilin Wang, Peyman Milanfar, and Feng Yang. Musiq: Multi-scale image quality transformer. In Proceedings of the IEEE/CVF international conference on computer vision, pages 5148–5157, 2021.
  23. [23] Junjie Ke, Keren Ye, Jiahui Yu, Yonghui Wu, Peyman Milanfar, and Feng Yang. Vila: Learning image aesthetics from user comments with vision-language pretraining. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10041–10051, 2023.
  24. [24] Yan Ke, Xiaoou Tang, and Feng Jing. The design of high-level features for photo quality assessment. In 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06), volume 1, pages 419–426. IEEE, 2006.
  25. [25] Bernhard Kerbl, Georgios Kopanas, Thomas Leimkühler, George Drettakis, et al. 3d gaussian splatting for real-time radiance field rendering. ACM Trans. Graph., 42(4):139–1, 2023.
  26. [26] Arno Knapitsch, Jaesik Park, Qian-Yi Zhou, and Vladlen Koltun. Tanks and temples: Benchmarking large-scale scene reconstruction. ACM Transactions on Graphics, 36(4):78:1–78:13, 2017.
  27. [27] Shu Kong, Xiaohui Shen, Zhe Lin, Radomir Mech, and Charless Fowlkes. Photo aesthetics ranking network with attributes and content adaptation. In European Conference on Computer Vision (ECCV), pages 662–679. Springer, 2016.
  28. [28] Zheng Li, Bingxu Xie, Chao Chu, Weiqing Li, and Zhiyong Su. No-reference geometry quality assessment for colorless point clouds via list-wise rank learning. Computers & Graphics, 127:104176, 2025.
  29. [29] Hanxue Liang, Tianhao Wu, Param Hanji, Francesco Banterle, Hongyun Gao, Rafal Mantiuk, and Cengiz Öztireli. Perceptual quality assessment of nerf and neural view synthesis methods for front-facing views. In Computer Graphics Forum, volume 43, page e15036. Wiley Online Library, 2024.
  30. [30] Lu Ling, Yichen Sheng, Zhi Tu, Wentian Zhao, Cheng Xin, Kun Wan, Lantao Yu, Qianyu Guo, Zixun Yu, et al. Dl3dv-10k: A large-scale scene dataset for deep learning-based 3d vision. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
  31. [31] Xin Lu, Zhe Lin, Hailin Jin, Jianchao Yang, and James Z Wang. Rapid: Rating pictorial aesthetics using deep learning. In Proceedings of the 22nd ACM international conference on Multimedia, pages 457–466, 2014.
  32. [32] Xin Lu, Zhe Lin, Hailin Jin, Jianchao Yang, and James Z Wang. Rating image aesthetics using deep learning. IEEE Transactions on Multimedia, 17(11):2021–2034, 2015.
  33. [33] Pei Lv, Jianqi Fan, Xixi Nie, Weiming Dong, Xiaoheng Jiang, Bing Zhou, Mingliang Xu, and Changsheng Xu. User-guided personalized image aesthetic assessment based on deep reinforcement learning. IEEE Transactions on Multimedia, 25:736–749, 2021.
  34. [34] Luca Marchesotti, Naila Murray, and Florent Perronnin. Discovering beautiful attributes for aesthetic image analysis. International journal of computer vision, 113(3):246–266, 2015.
  35. [35] Pedro Martin, António Rodrigues, João Ascenso, and Maria Paula Queluz. Nerf view synthesis: Subjective quality assessment and objective metrics evaluation. IEEE Access, 13:26–41, 2024.
  36. [36] Pedro Martin, António Rodrigues, João Ascenso, and Maria Paula Queluz. Gs-qa: Comprehensive quality assessment benchmark for gaussian splatting view synthesis. In 2025 17th International Conference on Quality of Multimedia Experience (QoMEX), pages 1–7. IEEE, 2025.
  37. [37] Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, and Ren Ng. Nerf: Representing scenes as neural radiance fields for view synthesis. In European Conference on Computer Vision (ECCV), 2020.
  38. [38] Ben Mildenhall, Pratul P Srinivasan, Matthew Tancik, Jonathan T Barron, Ravi Ramamoorthi, and Ren Ng. Nerf: Representing scenes as neural radiance fields for view synthesis. Communications of the ACM, 65(1):99–106, 2021.
  39. [39] Naila Murray, Luca Marchesotti, and Florent Perronnin. Ava: A large-scale database for aesthetic visual analysis. In 2012 IEEE conference on computer vision and pattern recognition, pages 2408–2415. IEEE, 2012.
  40. [40] Ezgi Ozyilkan, Zhiqi Chen, Oren Rippel, Jona Ballé, and Kedar Tatwawadi. Drop-in perceptual optimization for 3d gaussian splatting. arXiv preprint arXiv:2603.23297, 2026.
  41. [41] Jongchan Park, Joon-Young Lee, Donggeun Yoo, and In So Kweon. Distort-and-recover: Color enhancement using deep reinforcement learning. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 5928–5936, 2018.
  42. [42] Charles Ruizhongtai Qi, Li Yi, Hao Su, and Leonidas J Guibas. Pointnet++: Deep hierarchical feature learning on point sets in a metric space. Advances in neural information processing systems, 30, 2017.
  43. [43] Qiang Qu, Hanxue Liang, Xiaoming Chen, Yuk Ying Chung, and Yiran Shen. Nerf-nqa: No-reference quality assessment for scenes generated by nerf and neural view synthesis methods. IEEE Transactions on Visualization and Computer Graphics, 30(5):2129–2139, 2024.
  44. [44] Qiang Qu, Yiran Shen, Xiaoming Chen, Yuk Ying Chung, Weidong Cai, and Tongliang Liu. Nvs-sqa: Exploring self-supervised quality representation learning for neurally synthesized scenes without references. arXiv preprint arXiv:2501.06488, 2025.
  45. [45] Jian Ren, Xiaohui Shen, Zhe Lin, Radomir Mech, and David J Foran. Personalized image aesthetics. In Proceedings of the IEEE international conference on computer vision, pages 638–647, 2017.
  46. [46] Dongyu She, Yu-Kun Lai, Gaoxiong Yi, and Kun Xu. Hierarchical layout-aware graph convolutional network for unified aesthetics assessment. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 8475–8484, 2021.
  47. [47] Kekai Sheng, Weiming Dong, Chongyang Ma, Xing Mei, Feiyue Huang, and Bao-Gang Hu. Attention-based multi-patch aggregation for image aesthetic assessment. In Proceedings of the 26th ACM international conference on Multimedia, pages 879–886, 2018.
  48. [48] Hossein Talebi and Peyman Milanfar. Nima: Neural image assessment. IEEE transactions on image processing, 27(8):3998–4011, 2018.
  49. [49] Zhaolin Wan, Yining Diao, Jingqi Xu, Hao Wang, Zhiyang Li, Xiaopeng Fan, Wangmeng Zuo, and Debin Zhao. Perceptual quality assessment of 3d gaussian splatting: A subjective dataset and prediction metric. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 40, pages 9657–9665, 2026.
  50. [50] Qian Wang, Zongju Peng, Wenhui Zou, Fen Chen, Kai Xu, and Youshuang Zhao. Of-nerf: A subjective benchmark of perceptual quality assessment for outward-facing nerf scenes with multiple distortions and diverse viewing trajectories. In Proceedings of the 7th ACM International Conference on Multimedia in Asia, pages 1–7, 2025.
  51. [51] Yuehao Wang, Chaoyi Wang, Bingchen Gong, and Tianfan Xue. Bilateral guided radiance field processing. ACM Transactions on Graphics (TOG), 43(4):1–13, 2024.
  52. [52] Haoning Wu, Erli Zhang, Liang Liao, Chaofeng Chen, Jingwen Hou, Annan Wang, Wenxiu Sun, Qiong Yan, and Weisi Lin. Exploring video quality assessment on user generated contents from aesthetic and technical perspectives. In Proceedings of the IEEE/CVF international conference on computer vision, pages 20144–20154, 2023.
  53. [53] Xiaoyang Wu, Li Jiang, Peng-Shuai Wang, Zhijian Liu, Xihui Liu, Yu Qiao, Wanli Ouyang, Tong He, and Hengshuang Zhao. Point transformer v3: Simpler faster stronger. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 4840–4851, 2024.
  54. [54] Xiaoyang Wu, Daniel DeTone, Duncan Frost, Tianwei Shen, Chris Xie, Nan Yang, Jakob Engel, Richard Newcombe, Hengshuang Zhao, and Julian Straub. Sonata: Self-supervised learning of reliable point representations. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 22193–22204, 2025.
  55. [55] Yuke Xing, Jiarui Wang, Peizhi Niu, Wenjie Huang, Guangtao Zhai, and Yiling Xu. 3dgs-ieval-15k: a large-scale image quality evaluation database for 3d gaussian-splatting. In Proceedings of the 33rd ACM International Conference on Multimedia, pages 12682–12689, 2025.
  56. [56] Haofei Xu, Songyou Peng, Fangjinhua Wang, Hermann Blum, Daniel Barath, Andreas Geiger, and Marc Pollefeys. Depthsplat: Connecting gaussian splatting and depth. arXiv preprint arXiv:2410.13862, 2024.
  57. [57] Mingze Xu et al. Stochasticity-aware no-reference point cloud quality assessment. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), 2025.
  58. [58] Zhicheng Yan, Hao Zhang, Baoyuan Wang, Sylvain Paris, and Yizhou Yu. Automatic photo adjustment using deep neural networks. ACM Transactions on Graphics (TOG), 35(2):1–15, 2016.
  59. [59] Qi Yang, Kaifa Yang, Yuke Xing, Yiling Xu, and Zhu Li. A benchmark for gaussian splatting compression and quality assessment study. In Proceedings of the 6th ACM International Conference on Multimedia in Asia, pages 1–8, 2024.
  60. [60] Yuzhe Yang, Liwu Xu, Leida Li, Nan Qie, Yaqian Li, Peng Zhang, and Yandong Guo. Personalized image aesthetics assessment with rich attributes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 19861–19869, 2022.
  61. [61] Hsin-Ho Yeh, Chun-Yu Yang, Ming-Sui Lee, and Chu-Song Chen. Video aesthetic quality assessment by temporal integration of photo- and motion-based features. IEEE transactions on multimedia, 15(8):1944–1957, 2013.
  62. [62] Xuanhua Yin, Chuanzhi Xu, Haoxian Zhou, Boyu Wei, and Weidong Cai. Accelaes: Accelerating diffusion transformers for training-free aesthetic-enhanced image generation. arXiv preprint arXiv:2603.12575, 2026.
  63. [63] Dingxi Zhang, Yu-Jie Yuan, Zhuoxun Chen, Fang-Lue Zhang, Zhenliang He, Shiguang Shan, and Lin Gao. Stylizedgs: Controllable stylization for 3d gaussian splatting. IEEE Transactions on Pattern Analysis and Machine Intelligence, 47(12):11961–11973, 2025. doi: 10.1109/TPAMI.2025.3604010.
  64. [64] Qi Zhang, Yiling Li, Guangtao Zhai, and Kaifa Yang. Mm-pcqa: Multi-modal learning for no-reference point cloud quality assessment. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), 2023.
  65. [65] Yuhang Zhang, Joshua Maraval, Zhengyu Zhang, Nicolas Ramin, Shishun Tian, and Lu Zhang. Evaluating human perception of novel view synthesis: Subjective quality assessment of gaussian splatting and nerf in dynamic scenes. arXiv preprint arXiv:2501.08072, 2025.
  66. [66] Hongbi Zhou and Zhangkai Ni. Perceptual-gs: Scene-adaptive perceptual densification for gaussian splatting. arXiv preprint arXiv:2506.12400, 2025.
  67. [67] Hancheng Zhu, Leida Li, Jinjian Wu, Sicheng Zhao, Guiguang Ding, and Guangming Shi. Personalized image aesthetics assessment via meta-learning with bilevel gradient optimization. IEEE Transactions on Cybernetics, 52(3):1798–1811, 2020.