An Algorithm for Transmitting VR Video Based on Adaptive Modulation

Guangtao Zhai; Jie Feng; Ning Liu; Wenjun Zhang; Yongpeng Wu

arxiv: 1906.11402 · v1 · pith:MOJ5YXBFnew · submitted 2019-06-27 · 💻 cs.NI · cs.IT· eess.SP· math.IT

An Algorithm for Transmitting VR Video Based on Adaptive Modulation

Jie Feng , Yongpeng Wu , Guangtao Zhai , Ning Liu , Wenjun Zhang This is my paper

Pith reviewed 2026-05-25 14:32 UTC · model grok-4.3

classification 💻 cs.NI cs.ITeess.SPmath.IT

keywords VR video streamingfield of view predictionadaptive modulationconvex optimizationquality of experiencewireless video transmission360-degree video

0 comments

The pith

A VR video algorithm sends only predicted field-of-view partitions at high quality using convex optimization and adaptive modulation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a transmission strategy for VR videos that avoids sending the entire 360-degree scene at high quality. It predicts each user's field of view, divides the video screen into partitions, and transmits only the partitions likely to appear in view at high quality while using lower quality elsewhere. The approach defines a quality-of-experience metric, models wireless channel fluctuations, and formulates an optimization problem solved via convex optimization with added adaptive modulation to improve bandwidth use. Simulations indicate the resulting algorithm outperforms other VR streaming methods.

Core claim

By dividing the VR video into partitions based on predicted user field of view, defining a QoE metric, modeling the channel, and solving the resulting resource allocation problem through convex optimization combined with adaptive modulation, the transmission uses bandwidth more efficiently while preserving viewing experience under varying wireless conditions.

What carries the argument

The convex optimization formulation that allocates transmission resources and modulation schemes across video partitions to maximize the QoE metric subject to channel and bandwidth constraints.

If this is right

Only the viewed portion of each 360-degree frame needs high-quality transmission, reducing total data sent.
Adaptive modulation adjusts to channel fluctuations while the optimization maintains the QoE target.
The partition-based approach scales the transmission load with the size of the predicted view rather than the full sphere.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If FoV prediction models improve, the same optimization structure could support even lower average bit rates for the same perceived quality.
The method could extend to multi-user scenarios where different users have overlapping but non-identical predicted views.
Replacing the convex solver with a faster heuristic might preserve most gains while reducing computation at the transmitter.

Load-bearing premise

The field-of-view of each user can be predicted with high probability.

What would settle it

A direct comparison in simulation or testbed where FoV prediction accuracy falls below the high-probability threshold and the algorithm's bandwidth or QoE performance falls below that of non-partitioned baselines.

Figures

Figures reproduced from arXiv: 1906.11402 by Guangtao Zhai, Jie Feng, Ning Liu, Wenjun Zhang, Yongpeng Wu.

**Figure 2.** Figure 2: BER for various modulations level as a function of short-term average [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗

**Figure 3.** Figure 3: The variation of the normalized QoE with [PITH_FULL_IMAGE:figures/full_fig_p005_3.png] view at source ↗

**Figure 5.** Figure 5: The distribution of the bitrates for FoV when [PITH_FULL_IMAGE:figures/full_fig_p005_5.png] view at source ↗

read the original abstract

Virtual reality (VR) is making waves around the world recently. However, traditional video streaming is not suitable for VR video because of the huge size and view switch requirements of VR videos. Since the view of each user is limited, it is unnecessary to send the whole 360-degree scene at high quality which can be a heavy burden for the transmission system. Assuming filed-of-view (FoV) of each user can be predicted with high probability, we can divide the video screen into partitions and send those partitions which will appear in FoV at high quality. Hence, we propose an novel strategy for VR video streaming. First, we define a quality-of-experience metric to measure the viewing experience of users and define a channel model to reflect the fluctuation of the wireless channel. Next, we formulate the optimization problem and find its feasible solution by convex optimization. In order to improve bandwidth efficiency, we also add adaptive modulation to this part. Finally, we compare our algorithm with other VR streaming algorithm in the simulation. It turns out that our algorithm outperforms other algorithms.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper wires together standard FoV prediction, convex optimization, and adaptive modulation for VR streaming and claims simulation gains, but those gains rest on an untested high-accuracy prediction assumption.

read the letter

The main things to know are that this is an assembly of existing pieces rather than a new framework, and that the reported outperformance only holds under the abstract's assumption of high-probability FoV prediction. The authors define a QoE metric, model channel fluctuation, set up an optimization problem solved via convex methods, add adaptive modulation for efficiency, and run simulations showing their approach beats other VR streaming algorithms. Those components are already common in the wireless multimedia work they cite, so the novelty is limited to the specific combination and the simulation comparison. If the full text supplies the actual equations and parameter values, the work at least becomes reproducible in principle and the pipeline from prediction to allocation is laid out clearly. That is the part that is done competently. The central soft spot is the prediction assumption. The optimization and QoE both depend on the video being correctly partitioned into FoV and non-FoV regions. The abstract gives no indication that the simulations varied prediction error rates, injected realistic head-movement noise, or compared against a baseline that does not rely on prediction. If prediction accuracy is only moderate, the partitions are wrong, the feasible set changes, and the bandwidth savings and QoE improvements are likely to shrink or vanish. That is not a minor robustness check; it is load-bearing for the headline result. This paper is aimed at people working on practical wireless VR delivery who might want one more simulation study that wires the standard tools together. A reader looking for new theory, new evidence on prediction robustness, or results that survive realistic error would not get much from it. I would not bring it to a reading group and would not cite it. It does not look like it needs a serious referee.

Referee Report

3 major / 2 minor

Summary. The paper proposes a VR video streaming strategy that assumes FoV can be predicted with high probability, partitions the 360-degree video accordingly, defines a QoE metric and wireless channel model, formulates an optimization problem whose feasible solution is obtained via convex optimization, incorporates adaptive modulation for bandwidth efficiency, and reports via simulation that the resulting algorithm outperforms other VR streaming algorithms.

Significance. If the optimization formulation, feasible-set construction, and simulation results hold under realistic conditions, the work could contribute to more efficient wireless delivery of VR content by avoiding transmission of unnecessary high-quality tiles outside the predicted viewport; however, the absence of any equations, quantitative results, or robustness checks limits assessment of whether the claimed gains are substantive.

major comments (3)

[Abstract] Abstract: the optimization problem is stated to be formulated and solved by convex optimization, yet the manuscript supplies neither the objective function, decision variables, constraints, nor the convexification steps, so it is impossible to verify that a feasible solution exists or that the reported performance gains do not reduce to post-hoc parameter choices.
[Abstract] Abstract (simulation paragraph): the claim that the algorithm 'outperforms other algorithms' is presented without any numerical results, tables, figures, baseline descriptions, or error bars, rendering the central empirical claim unverifiable.
[Abstract] Abstract: the entire approach is conditioned on the assumption that 'FoV of each user can be predicted with high probability,' but no sensitivity analysis, head-movement noise model, or comparison against a no-prediction baseline is described; if prediction error is non-negligible the feasible set and delivered QoE both degrade, which directly threatens the outperformance claim.

minor comments (2)

[Abstract] Typo: 'filed-of-view' should read 'field-of-view'.
[Abstract] The abstract refers to 'this part' when describing adaptive modulation; the intended scope is unclear without section headings or equation references.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments highlighting the need for greater detail and verifiability in the abstract and manuscript. We agree that the current abstract is too concise to allow independent assessment of the optimization and results. Below we respond point-by-point and commit to a major revision that incorporates the requested elements without altering the core approach.

read point-by-point responses

Referee: [Abstract] Abstract: the optimization problem is stated to be formulated and solved by convex optimization, yet the manuscript supplies neither the objective function, decision variables, constraints, nor the convexification steps, so it is impossible to verify that a feasible solution exists or that the reported performance gains do not reduce to post-hoc parameter choices.

Authors: The abstract is intentionally brief; the full manuscript (Sections 3–4) defines the QoE objective as a weighted sum of tile qualities inside and outside the predicted FoV, decision variables as per-tile quality levels and modulation orders, power and bandwidth constraints, and the convex relaxation of the integer modulation constraints via successive convex approximation. To make this verifiable from the abstract alone we will insert a compact summary paragraph listing the objective, key variables, and convexification technique. revision: yes
Referee: [Abstract] Abstract (simulation paragraph): the claim that the algorithm 'outperforms other algorithms' is presented without any numerical results, tables, figures, baseline descriptions, or error bars, rendering the central empirical claim unverifiable.

Authors: The simulation section of the manuscript contains figures and tables, but these were not referenced in the abstract. In revision we will augment the abstract with representative numerical outcomes (e.g., average QoE improvement of X % and bandwidth reduction of Y % versus uniform and non-adaptive baselines) together with a brief description of the baselines and mention of Monte-Carlo error bars. revision: yes
Referee: [Abstract] Abstract: the entire approach is conditioned on the assumption that 'FoV of each user can be predicted with high probability,' but no sensitivity analysis, head-movement noise model, or comparison against a no-prediction baseline is described; if prediction error is non-negligible the feasible set and delivered QoE both degrade, which directly threatens the outperformance claim.

Authors: The current manuscript evaluates performance only under the stated high-accuracy prediction assumption. We will add a dedicated robustness subsection that introduces a Gaussian head-movement noise model, sweeps prediction error variance, compares against a no-prediction baseline that transmits the entire sphere at medium quality, and reports the resulting QoE degradation curves. This will quantify the sensitivity and clarify the operating regime of the claimed gains. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected; derivation is self-contained

full rationale

The paper states an assumption of high-probability FoV prediction, partitions the video accordingly, defines a QoE metric and channel model, formulates an optimization problem solved via convex optimization with adaptive modulation, and reports simulation outperformance. No equations, self-citations, or steps are present that reduce any claimed result or prediction to the inputs by construction (e.g., no fitted parameter renamed as prediction, no self-definitional loop, no load-bearing self-citation). The simulation comparison is presented as empirical validation under the stated assumptions rather than a tautology, making the derivation self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the domain assumption that FoV prediction is reliable enough to justify differential quality transmission and on the unstated details of the convex optimization formulation and simulation setup.

axioms (1)

domain assumption Field-of-view (FoV) of each user can be predicted with high probability
Explicitly invoked in the abstract as the premise that allows partitioning the video and sending only FoV partitions at high quality.

pith-pipeline@v0.9.0 · 5727 in / 1242 out tokens · 23487 ms · 2026-05-25T14:32:47.609212+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

10 extracted references · 10 canonical work pages · 1 internal anchor

[1]

Report of market outlook and investment strategy planning on China virtual reality industry

Prospective Industrial Research Institute. Report of market outlook and investment strategy planning on China virtual reality industry. [Online]. Availeble: https://bg.qianzhan.com/report/detail/7fbbfaa066164817.html

work page
[2]

B. Begole. (2017, Jan. 27) Why the internet pipes will burst when virtual reality takes off. [Online]. Available: http://www.forbes.com/sites/valleyvoices/2016/02/09/why-the-internet- pipes-will-burst-if-virtual-reality-takes-off

work page 2017
[3]

View-aware tile-based adaptations in 360 virtual reality video streaming,

M. Hosseini, “View-aware tile-based adaptations in 360 virtual reality video streaming,” in Proc. IEEE Virtual Reality (VR) , Los Angeles, CA, 2017, pp. 423-424

work page 2017
[4]

A Rate Adaptation Algorithm for Tile-based 360-degree Video Streaming

A. Ghosh, V . Aggarwal, and F. Qian, “A rate adaption algorithm for tile- based 360-degree video streaming,” arXiv preprint arXiv:1704.08215 , 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017
[5]

JET: Joint source and channel coding for error resilient virtual reality video wireless transmission,

Z. Liu, S. Ishihara, Y . Cui, Y . Ji, and Y . Tanaka, “JET: Joint source and channel coding for error resilient virtual reality video wireless transmission,” J. Signal Processing , no. 147, pp. 154-162, 2018

work page 2018
[6]

Virtual reality over wireless networks: quality-of-service model and learning-based resource management,

M. Chen, W. Saad, and C. Yin, “Virtual reality over wireless networks: quality-of-service model and learning-based resource management,” IEEE Trans. Commun. , vol. 66, no. 11, pp. 5621-5635, Nov. 2018

work page 2018
[7]

The prediction of head and eye movement for 360 degree images,

Y . Zhu, G. Zhai, and X. Min, “The prediction of head and eye movement for 360 degree images,” J. Signal Processing: Image Communication (2018). [Online]. Available: http://doi.org/10.1016/j.image.2018.05.010

work page doi:10.1016/j.image.2018.05.010 2018
[8]

Gaze-Aware streaming solutions for the next generation of mobile VR experiences,

P. Lungaro, R. Sj ¨oberg, A. J. F. Valero, A. Mittal, and K. Tollmar, “Gaze-Aware streaming solutions for the next generation of mobile VR experiences,” IEEE Trans. Visual. Comput. Graphics , vol. 24, no. 4, pp. 1535-1544, Apr. 2018

work page 2018
[9]

Optimizing 360 video delivery over cellular networks,

F. Qian, L. Ji, B. Han, and V . Gopalakrishnan, “Optimizing 360 video delivery over cellular networks,” in Proc. 5th Workshop on All Things Cellular: Operations, Applications and Challenges , ser. ATC 16. New York, NY , USA: ACM, 2016

work page 2016
[10]

Kuzyakov and D

E. Kuzyakov and D. Pio. (2017, Jan. 28) Next-generation video encoding techniques for 360 video and vr . [Online]. Available: https://code.facebook.com/posts/1126354007399553/ next-generation- video-encoding-techniques-for-360-video-and-vr/

work page arXiv 2017

[1] [1]

Report of market outlook and investment strategy planning on China virtual reality industry

Prospective Industrial Research Institute. Report of market outlook and investment strategy planning on China virtual reality industry. [Online]. Availeble: https://bg.qianzhan.com/report/detail/7fbbfaa066164817.html

work page

[2] [2]

B. Begole. (2017, Jan. 27) Why the internet pipes will burst when virtual reality takes off. [Online]. Available: http://www.forbes.com/sites/valleyvoices/2016/02/09/why-the-internet- pipes-will-burst-if-virtual-reality-takes-off

work page 2017

[3] [3]

View-aware tile-based adaptations in 360 virtual reality video streaming,

M. Hosseini, “View-aware tile-based adaptations in 360 virtual reality video streaming,” in Proc. IEEE Virtual Reality (VR) , Los Angeles, CA, 2017, pp. 423-424

work page 2017

[4] [4]

A Rate Adaptation Algorithm for Tile-based 360-degree Video Streaming

A. Ghosh, V . Aggarwal, and F. Qian, “A rate adaption algorithm for tile- based 360-degree video streaming,” arXiv preprint arXiv:1704.08215 , 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017

[5] [5]

JET: Joint source and channel coding for error resilient virtual reality video wireless transmission,

Z. Liu, S. Ishihara, Y . Cui, Y . Ji, and Y . Tanaka, “JET: Joint source and channel coding for error resilient virtual reality video wireless transmission,” J. Signal Processing , no. 147, pp. 154-162, 2018

work page 2018

[6] [6]

Virtual reality over wireless networks: quality-of-service model and learning-based resource management,

M. Chen, W. Saad, and C. Yin, “Virtual reality over wireless networks: quality-of-service model and learning-based resource management,” IEEE Trans. Commun. , vol. 66, no. 11, pp. 5621-5635, Nov. 2018

work page 2018

[7] [7]

The prediction of head and eye movement for 360 degree images,

Y . Zhu, G. Zhai, and X. Min, “The prediction of head and eye movement for 360 degree images,” J. Signal Processing: Image Communication (2018). [Online]. Available: http://doi.org/10.1016/j.image.2018.05.010

work page doi:10.1016/j.image.2018.05.010 2018

[8] [8]

Gaze-Aware streaming solutions for the next generation of mobile VR experiences,

P. Lungaro, R. Sj ¨oberg, A. J. F. Valero, A. Mittal, and K. Tollmar, “Gaze-Aware streaming solutions for the next generation of mobile VR experiences,” IEEE Trans. Visual. Comput. Graphics , vol. 24, no. 4, pp. 1535-1544, Apr. 2018

work page 2018

[9] [9]

Optimizing 360 video delivery over cellular networks,

F. Qian, L. Ji, B. Han, and V . Gopalakrishnan, “Optimizing 360 video delivery over cellular networks,” in Proc. 5th Workshop on All Things Cellular: Operations, Applications and Challenges , ser. ATC 16. New York, NY , USA: ACM, 2016

work page 2016

[10] [10]

Kuzyakov and D

E. Kuzyakov and D. Pio. (2017, Jan. 28) Next-generation video encoding techniques for 360 video and vr . [Online]. Available: https://code.facebook.com/posts/1126354007399553/ next-generation- video-encoding-techniques-for-360-video-and-vr/

work page arXiv 2017