arxiv: 1802.10240 · v1 · pith:AGMO74SCnew · submitted 2018-02-28 · 💻 cs.CV

Neural Aesthetic Image Reviewer

classification 💻 cs.CV
keywords image, aesthetic, aesthetics, score, ava-reviews, comments, dataset, generate

Recently, there has been rising interest in perceiving image aesthetics. Existing works treat image aesthetics as a classification or regression problem. To extend the cognition from rating to reasoning, a deeper understanding of aesthetics should be based on revealing why a high or low aesthetic score should be assigned to an image. From this point of view, we propose a model referred to as Neural Aesthetic Image Reviewer, which can not only give an aesthetic score for an image, but also generate a textual description explaining why the image leads to a plausible rating score. Specifically, we propose two multi-task architectures based on shared aesthetically semantic layers and task-specific embedding layers at a high level for performance improvement on different tasks. To facilitate research on this problem, we collect the AVA-Reviews dataset, which contains 52,118 images and 312,708 comments in total. Through multi-task learning, the proposed models can rate aesthetic images as well as produce comments in an end-to-end manner. The proposed models outperform the baselines in the performance evaluation on the AVA-Reviews dataset. Moreover, we demonstrate experimentally that our model can generate textual reviews related to aesthetics, which are consistent with human perception.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Aesthetic Attributes Assessment of Images

    cs.CV 2019-07 unverdicted novelty 4.0

    The paper proposes the Aesthetic Multi-Attribute Network (AMAN) that jointly predicts captions and scores for five aesthetic attributes using a new weakly-labeled dataset created via knowledge transfer.