Handcrafted vs Deep Learning Classification for Scalable Video QoE Modeling
pith:F2Y2PRVL
Mobile video traffic is dominant in cellular and enterprise wireless networks. With the advent of diverse applications, network administrators face the challenge of providing high QoE in the face of diverse wireless conditions and application content. Yet state-of-the-art networks lack analytics for QoE, as this requires support from the application or user feedback. While existing techniques map QoS to QoE by training machine learning models without requiring user feedback, they are limited to only a few applications because of insufficient QoE ground-truth annotation for ML. To address these limitations, we focus on video telephony applications and model key artefacts of spatial and temporal video QoE. Our key contribution is designing content- and device-independent metrics and training across diverse WiFi conditions. We show that our metrics achieve a median accuracy of 90% when compared against mean opinion scores from more than 200 users and 800 video samples over three popular video telephony applications: Skype, FaceTime and Google Hangouts. We further extend our metrics with deep neural networks, specifically a combined CNN and LSTM model. Combining our QoE metrics with the deep learning model achieves a median accuracy of 95%, a 38% improvement over well-known state-of-the-art techniques.