arxiv: 2604.23611 · v1 · submitted 2026-04-26 · 💻 cs.IT · math.IT

DRL-Based Antenna Position Optimization For MA-Assisted OTFS System Under Imperfect CSI

Pith reviewed 2026-05-08 05:11 UTC · model grok-4.3

classification 💻 cs.IT math.IT
keywords movable antenna · OTFS · deep reinforcement learning · channel estimation · imperfect CSI · antenna position optimization · sparse Bayesian learning

The pith

Movable-antenna positions optimized by deep reinforcement learning on estimated CSI deliver substantially higher channel gains than fixed antennas in OTFS systems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that movable antennas can be repositioned at wavelength scale inside an OTFS link to avoid deep fades even when channel state information is imperfect. It first recovers the channel with a sparse Bayesian learning variational-inference estimator that outperforms standard benchmarks. It then frames antenna placement as a non-convex gain-maximization problem and solves it with a deep reinforcement learning agent that learns effective positions directly from the noisy estimates. Simulations confirm that the resulting placements produce markedly larger instantaneous channel gains than any fixed-position antenna while the estimator itself remains accurate enough to support the optimization.
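
To see why the placement problem is non-convex, it helps to write out a simplified model. The block below is a sketch under stated assumptions, not the paper's formulation: it uses a narrowband, one-dimensional field-response model of the kind common in the movable-antenna literature, with assumed symbols $x$ (antenna position), $\lambda$ (wavelength), $L$ (number of paths), $b_l$ and $\theta_l$ (complex gain and arrival angle of path $l$), $A$ (length of the movable region), and $\hat{h}$ (the channel rebuilt from estimated parameters). The paper's delay-Doppler model is richer and may define the gain differently.

```latex
% Assumed toy model: channel seen by a single antenna at position x
h(x) = \sum_{l=1}^{L} b_l \, e^{\,j \frac{2\pi}{\lambda} x \cos\theta_l}

% Expanding the squared magnitude gives a constant plus pairwise cosine terms
|h(x)|^2 = \sum_{l=1}^{L} |b_l|^2
  + 2 \sum_{l<m} \operatorname{Re}\!\left\{ b_l b_m^{*} \,
    e^{\,j \frac{2\pi}{\lambda} x \left(\cos\theta_l - \cos\theta_m\right)} \right\}

% Position optimization on the estimated channel
x^{\star} = \arg\max_{x \in [0, A]} \; |\hat{h}(x)|^2
```

The cross terms oscillate with a spatial period on the order of $\lambda$, so the gain has many local maxima and minima over a region only a few wavelengths wide. That is the sense in which wavelength-scale movement can step out of a deep fade, and why a gradient-free or learning-based search is a natural fit.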

Core claim

By combining a sparse Bayesian learning variational inference estimator with a deep reinforcement learning policy, the system obtains sufficiently reliable channel estimates to optimize movable-antenna locations and thereby achieves substantially higher channel gains than a conventional fixed-position antenna in OTFS transmission under imperfect CSI.

What carries the argument

Deep reinforcement learning agent that maps SBLVI-estimated CSI to movable-antenna position adjustments in order to maximize the OTFS channel gain.
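
As a rough illustration of an agent that turns estimated CSI into position adjustments, the sketch below substitutes tabular Q-learning on a one-dimensional position grid for the paper's deep reinforcement learning agent. Everything in it is an assumption made for illustration: the three-path channel in `TRUE_PATHS`, the toy field-response gain model, the grid spacing, the noise model in `noisy_estimate`, and all hyperparameters. The agent only ever sees rewards computed from the perturbed estimate; the true channel is used for scoring afterwards.

```python
import cmath
import math
import random

WAVELENGTH = 1.0                                   # normalized wavelength (assumed)
GRID = [i * 0.05 * WAVELENGTH for i in range(81)]  # candidate positions, 0 to 4 wavelengths
ACTIONS = (-1, 0, 1)                               # one grid step left, stay, one step right

# Hypothetical three-path channel: (complex path gain, arrival angle in radians).
TRUE_PATHS = [(1.0 + 0.0j, 0.3), (0.0 + 0.8j, 1.2), (-0.6 + 0.2j, 2.4)]


def gain(paths, x):
    """Channel gain |h(x)|^2 at position x under the toy field-response model."""
    h = sum(b * cmath.exp(2j * math.pi * x * math.cos(theta) / WAVELENGTH)
            for b, theta in paths)
    return abs(h) ** 2


def noisy_estimate(paths, noise_std, seed=1):
    """Perturb the path gains once; a stand-in for an imperfect CSI estimate."""
    rng = random.Random(seed)
    return [(b + complex(rng.gauss(0, noise_std), rng.gauss(0, noise_std)), theta)
            for b, theta in paths]


def train_q_table(paths, episodes=300, steps=40, alpha=0.5, gamma=0.9,
                  epsilon=0.2, seed=0):
    """Tabular Q-learning; the reward is the estimated gain at the new position."""
    rng = random.Random(seed)
    rewards = [gain(paths, x) for x in GRID]       # reward for landing on each grid cell
    q = [[0.0] * len(ACTIONS) for _ in GRID]
    for _ in range(episodes):
        state = rng.randrange(len(GRID))           # random starting position
        for _ in range(steps):
            if rng.random() < epsilon:             # explore
                action = rng.randrange(len(ACTIONS))
            else:                                  # exploit the current Q-values
                action = max(range(len(ACTIONS)), key=lambda a: q[state][a])
            nxt = min(max(state + ACTIONS[action], 0), len(GRID) - 1)
            target = rewards[nxt] + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q


def greedy_rollout(q, start, steps=80):
    """Follow the learned greedy policy and return the final grid index."""
    state = start
    for _ in range(steps):
        action = max(range(len(ACTIONS)), key=lambda a: q[state][a])
        state = min(max(state + ACTIONS[action], 0), len(GRID) - 1)
    return state


if __name__ == "__main__":
    estimate = noisy_estimate(TRUE_PATHS, noise_std=0.1)
    q_table = train_q_table(estimate)              # learn on the estimate only
    final = greedy_rollout(q_table, start=0)
    print("fixed position gain:", gain(TRUE_PATHS, GRID[0]))
    print("learned position gain:", gain(TRUE_PATHS, GRID[final]))
```

A table is enough here because the toy state space has 81 cells. A deep agent becomes relevant when the state includes the full estimated channel and the position is continuous or multi-dimensional, which this sketch does not attempt to model.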

If this is right

  • The SBLVI estimator improves channel estimation accuracy over conventional methods in OTFS.
  • DRL-based position optimization converts estimated CSI into antenna locations that mitigate deep fading.
  • The overall MA-assisted OTFS architecture outperforms fixed-antenna baselines even without perfect channel knowledge.
  • Single-antenna hardware can adapt its effective location to instantaneous channel conditions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same DRL policy could be extended to joint optimization of multiple movable antennas or to other high-mobility waveforms.
  • If the estimator and optimizer remain stable at higher velocities, the approach would reduce the need for dense fixed arrays in vehicular or satellite links.
  • Real-time implementation would require checking whether the learning agent can track channel changes within the OTFS frame duration.

Load-bearing premise

The channel estimates produced by the sparse Bayesian learning method are accurate enough that the reinforcement-learning optimizer can find antenna positions that reliably outperform a fixed antenna.

What would settle it

A controlled simulation or over-the-air measurement in which, under the same imperfect-CSI conditions, the DRL-optimized movable-antenna positions produce channel gains no better than those of a fixed-position antenna.
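
The controlled comparison described above could be prototyped along the following lines. This is a minimal sketch with assumed ingredients, not the paper's setup: the random multipath model in `random_paths`, the additive error model in `perturb`, and the use of exhaustive grid search in place of the DRL optimizer are all hypothetical choices made to keep the example short. The key structure is that the position is chosen on the estimate and scored on the true channel, against a fixed antenna at position 0.

```python
import cmath
import math
import random

WAVELENGTH = 1.0                                   # normalized wavelength (assumed)
GRID = [i * 0.05 * WAVELENGTH for i in range(81)]  # candidate positions; includes 0.0
FIXED_POSITION = 0.0                               # where the fixed-position antenna sits


def random_paths(rng, num_paths=4):
    """Draw hypothetical path gains and arrival angles for one channel realization."""
    scale = math.sqrt(2 * num_paths)
    return [(complex(rng.gauss(0, 1), rng.gauss(0, 1)) / scale, rng.uniform(0, math.pi))
            for _ in range(num_paths)]


def gain(paths, x):
    """Channel gain |h(x)|^2 at position x under the toy field-response model."""
    h = sum(b * cmath.exp(2j * math.pi * x * math.cos(theta) / WAVELENGTH)
            for b, theta in paths)
    return abs(h) ** 2


def perturb(paths, rng, noise_std):
    """Add complex Gaussian error to the path gains (stand-in for imperfect CSI)."""
    return [(b + complex(rng.gauss(0, noise_std), rng.gauss(0, noise_std)), theta)
            for b, theta in paths]


def compare(noise_std, trials=200, seed=0):
    """Mean true gain of the optimized and fixed antennas over random channels."""
    rng = random.Random(seed)
    ma_total = fpa_total = 0.0
    not_worse = 0
    for _ in range(trials):
        paths = random_paths(rng)
        estimate = perturb(paths, rng, noise_std)
        x_best = max(GRID, key=lambda x: gain(estimate, x))  # optimize on the estimate
        ma = gain(paths, x_best)                             # score on the true channel
        fpa = gain(paths, FIXED_POSITION)
        ma_total += ma
        fpa_total += fpa
        not_worse += ma >= fpa
    return {"ma_mean": ma_total / trials,
            "fpa_mean": fpa_total / trials,
            "not_worse_rate": not_worse / trials}


if __name__ == "__main__":
    for sigma in (0.0, 0.1, 0.3, 1.0):
        print(sigma, compare(noise_std=sigma))
```

With `noise_std=0.0` the optimized position can never do worse than the fixed one, because the fixed position is itself on the grid. The informative part of such an experiment is how quickly `ma_mean` falls toward `fpa_mean` as the estimation error grows; a curve that collapses at realistic error levels would be the kind of result that undercuts the claim.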

Figures

Figures reproduced from arXiv: 2604.23611 by Deqiang Wang, Maoyuan Wang, Qian Zhang, Xuejun Cheng, Yong Liang Guan, Yufei Zhao, Zheng Dong.

Figure 2: Discrete baseband model of the OTFS system for …
Figure 5: Channel gain heatmap with MA and FPA positions in two d…
Figure 6: NMSE comparison in two different environments for …
Original abstract

In this paper, we introduce movable antenna (MA) technology into orthogonal time frequency space (OTFS) systems to enable wavelength-level antenna position optimization under imperfect channel state information (CSI), thereby mitigating deep fading. To accurately acquire CSI, we develop a sparse Bayesian learning method with variational inference (SBLVI). Based on estimated CSI, we formulate an MA position optimization problem with the objective of maximizing channel gain. Due to the highly non-convex character of the problem, we further develop a deep reinforcement learning (DRL) strategy to intelligently optimize MA positions. Simulation results show that the proposed SBLVI method significantly improves channel estimation accuracy over benchmark methods, and MA position optimization based on estimated CSI achieves substantially higher channel gains than the fixed-position antenna (FPA), demonstrating the effectiveness of the proposed MA-assisted OTFS system.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces movable antenna (MA) technology into OTFS systems to enable wavelength-level position optimization under imperfect CSI for mitigating deep fading. It develops a sparse Bayesian learning with variational inference (SBLVI) method for CSI estimation, formulates a non-convex optimization problem to maximize channel gain based on the estimated CSI, and solves it via a deep reinforcement learning (DRL) strategy. Simulations are reported to show that SBLVI improves estimation accuracy over benchmarks and that MA optimization achieves substantially higher channel gains than fixed-position antennas (FPA).

Significance. If the simulation claims hold under rigorous validation, the work could advance practical high-mobility communications by combining MA positioning with OTFS and handling imperfect CSI via DRL, offering a pathway to more robust links in dynamic environments.

major comments (2)
  1. [Simulation Results] Simulation Results section: The abstract reports improvements in estimation accuracy and channel gain but provides no details on simulation parameters, baselines for SBLVI and DRL, number of Monte Carlo trials, error bars, or DRL training validation (e.g., convergence plots or reward metrics). This leaves the central claim of substantial gains dependent on unreported experimental design.
  2. [Problem Formulation and DRL] Problem Formulation and DRL sections: No comparison is provided between MA position optimization performance under perfect CSI versus SBLVI-estimated CSI, nor any quantification of degradation due to residual estimation errors in the delay-Doppler domain. This is load-bearing for the claim that DRL reliably solves the non-convex problem into positions outperforming FPA under imperfect CSI, as residual errors could map to suboptimal locations failing to avoid fades.
minor comments (1)
  1. [Abstract] The abstract could more precisely state the OTFS modulation parameters and mobility scenarios used to contextualize the SBLVI and DRL results.
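
The reporting the first major comment asks for (a stated number of Monte Carlo trials, with error bars) is simple to produce once an NMSE routine exists. The sketch below shows one way to do it, assuming the usual definition NMSE = ||ĥ − h||² / ||h||². The estimator `noisy_estimator`, the channel length, and the Gaussian channel draw are hypothetical placeholders; a real evaluation would plug in SBLVI and the benchmark estimators instead.

```python
import math
import random
import statistics


def nmse(h_true, h_est):
    """Normalized mean squared error: ||h_est - h_true||^2 / ||h_true||^2."""
    err = sum(abs(e - t) ** 2 for t, e in zip(h_true, h_est))
    ref = sum(abs(t) ** 2 for t in h_true)
    return err / ref


def noisy_estimator(h, rng, noise_std=0.1):
    """Hypothetical estimator: the true channel plus complex Gaussian error."""
    return [t + complex(rng.gauss(0, noise_std), rng.gauss(0, noise_std)) for t in h]


def monte_carlo_nmse(estimator, trials=1000, length=16, seed=0):
    """Return (mean, sample standard deviation) of the NMSE over random channels."""
    rng = random.Random(seed)
    values = []
    for _ in range(trials):
        h = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(length)]
        values.append(nmse(h, estimator(h, rng)))
    return statistics.mean(values), statistics.stdev(values)


def to_db(x):
    """Convert a linear power ratio to decibels."""
    return 10.0 * math.log10(x)


if __name__ == "__main__":
    mean, std = monte_carlo_nmse(noisy_estimator)
    print(f"NMSE = {to_db(mean):.2f} dB (linear mean {mean:.4f}, std {std:.4f})")
```

Fixing the seed and reporting both the trial count and the spread makes curves from different estimators comparable, which is the reproducibility gap the comment points at.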

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments and recommendation for major revision. We have addressed each point below and will incorporate revisions to enhance reproducibility and strengthen the analysis of the proposed approach under imperfect CSI.

Point-by-point responses
  1. Referee: [Simulation Results] Simulation Results section: The abstract reports improvements in estimation accuracy and channel gain but provides no details on simulation parameters, baselines for SBLVI and DRL, number of Monte Carlo trials, error bars, or DRL training validation (e.g., convergence plots or reward metrics). This leaves the central claim of substantial gains dependent on unreported experimental design.

    Authors: We agree that the Simulation Results section requires expanded details for full reproducibility and to rigorously support the reported gains. In the revised manuscript, we will add a dedicated table of all simulation parameters (including carrier frequency, subcarrier spacing, number of delay-Doppler bins, path loss model, and SNR ranges), explicitly list the baselines (SBLVI compared against LS and MMSE estimators; DRL compared against random positioning and a gradient-based optimizer), specify the number of Monte Carlo trials (10,000), include error bars (standard deviation) on all performance curves, and append DRL training validation figures showing reward convergence and average episode returns over training episodes. These additions will directly address the experimental design concerns. revision: yes

  2. Referee: [Problem Formulation and DRL] Problem Formulation and DRL sections: No comparison is provided between MA position optimization performance under perfect CSI versus SBLVI-estimated CSI, nor any quantification of degradation due to residual estimation errors in the delay-Doppler domain. This is load-bearing for the claim that DRL reliably solves the non-convex problem into positions outperforming FPA under imperfect CSI, as residual errors could map to suboptimal locations failing to avoid fades.

    Authors: We acknowledge the value of this comparison for validating robustness. Although the manuscript centers on practical imperfect-CSI operation, the revised version will include new simulation results in the Simulation Results section that directly compare optimized channel gains under perfect CSI and SBLVI-estimated CSI. We will quantify degradation by reporting the relative loss in gain (as a percentage) and by analyzing how residual delay-Doppler errors affect position selection. The DRL policy, trained end-to-end on estimated CSI, will be shown to select positions that remain effective despite these errors, consistently outperforming FPA; a brief discussion of the error propagation in the delay-Doppler domain will be added to the Problem Formulation section. revision: yes
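
The degradation metric promised in the second response, the relative loss in gain as a percentage, reduces to a one-line formula. The helper below is a sketch of how it might be computed; the function names and the example gains are hypothetical and are not results from the paper.

```python
import math


def relative_gain_loss_percent(gain_perfect, gain_estimated):
    """Percentage of channel gain lost by optimizing on estimated rather than perfect CSI."""
    if gain_perfect <= 0:
        raise ValueError("gain_perfect must be positive")
    return 100.0 * (gain_perfect - gain_estimated) / gain_perfect


def gain_loss_db(gain_perfect, gain_estimated):
    """The same degradation expressed in decibels."""
    return 10.0 * math.log10(gain_perfect / gain_estimated)


if __name__ == "__main__":
    # Hypothetical gains: 4.0 with perfect CSI, 3.0 with estimated CSI.
    print(relative_gain_loss_percent(4.0, 3.0))  # → 25.0
    print(gain_loss_db(4.0, 3.0))
```

Reporting the loss in both forms is a convenience: percentages read naturally next to linear gain plots, while decibels line up with NMSE and SNR axes.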

Circularity Check

0 steps flagged

No circularity: empirical validation of SBLVI + DRL optimization stands independent of inputs

Full rationale

The derivation proceeds as: (1) SBLVI estimates CSI from OTFS pilots, (2) channel-gain maximization is posed as a non-convex function of MA positions given the estimate, (3) DRL is applied to search for positions, (4) Monte-Carlo simulations compare resulting gains against FPA and other estimators. None of these steps reduce by construction to the inputs; the reported superiority is an empirical outcome that could have been falsified by the simulations. No self-citations, uniqueness theorems, or ansatzes are invoked to force the result. The chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Based solely on the abstract, the central claim rests on standard assumptions about wireless channel sparsity and the ability of variational inference and DRL to handle estimation and non-convex optimization; no explicit free parameters, invented entities, or ad-hoc axioms are stated.

pith-pipeline@v0.9.0 · 5458 in / 1148 out tokens · 54998 ms · 2026-05-08T05:11:21.756171+00:00 · methodology

