A bi-level priority sorting framework for flexible AGV service scheduling in smart warehouses
Pith reviewed 2026-05-10 10:10 UTC · model grok-4.3
The pith
A bi-level priority sorting framework for AGV scheduling in warehouses reduces average order delays and total system costs by over 50 percent during peak demand while keeping service levels above 90 percent.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper claims that its bi-level priority sorting framework reduces average order delay and total system costs by over 50 percent during peak demand periods, while maintaining a service level above 90 percent and maximizing AGV utilization across diverse layouts and demand scenarios. The first level dynamically adjusts scheduling parameters. The second level performs real-time routing optimization using the priority-deadline-shortest-path (PDSP) and delay-cost-shortest-path (DCSP) heuristics plus A* guided deep Q-learning (AGDQN).
What carries the argument
The bi-level priority sorting framework carries the argument: the first level adjusts scheduling parameters based on predefined customer priority categories, and the second level optimizes AGV paths with conflict avoidance, heuristic rules for multi-capacity tasks, and reinforcement learning refinement.
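The two levels can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the priority categories, the parameter values, the `Order` type, and the use of a static set of blocked cells as a stand-in for conflict avoidance. The paper's actual formulation is not reproduced in this summary, and the reinforcement learning refinement is omitted.

```python
import heapq
from dataclasses import dataclass

# Hypothetical priority categories and parameter values (not from the paper).
LEVEL_ONE_PARAMS = {
    "premium": {"commitment_min": 30, "delay_tolerance_min": 5},
    "standard": {"commitment_min": 60, "delay_tolerance_min": 15},
    "economy": {"commitment_min": 120, "delay_tolerance_min": 30},
}


@dataclass
class Order:
    order_id: str
    category: str
    pickup: tuple
    dropoff: tuple


def level_one(order):
    """First level: look up scheduling parameters from the priority category."""
    return LEVEL_ONE_PARAMS[order.category]


def a_star(grid, start, goal, blocked=frozenset()):
    """Second level: shortest 4-connected path on a grid using A*.

    grid[r][c] == 1 marks an obstacle. `blocked` holds cells assumed to be
    reserved by other AGVs, a simplified stand-in for conflict avoidance.
    Returns the path as a list of cells, or None if no path exists.
    """
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Manhattan distance: admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]
    came_from = {}
    best_g = {start: 0}
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = [cell]
            while cell in came_from:
                cell = came_from[cell]
                path.append(cell)
            return path[::-1]
        if g > best_g.get(cell, float("inf")):
            continue  # stale heap entry
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            if grid[nr][nc] == 1 or nxt in blocked:
                continue
            ng = g + 1
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                came_from[nxt] = cell
                heapq.heappush(open_heap, (ng + h(nxt), ng, nxt))
    return None
```

In this sketch the first level is a plain lookup and the second level is a single-agent search. A real implementation would presumably update the first-level parameters over time and resolve conflicts between AGVs in space and time.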
If this is right
- The framework sustains high service levels and efficiency under both normal and fluctuating demand patterns.
- AGV utilization improves without proportional rises in overall system costs.
- The priority sorting mechanism delivers performance gains relative to other reinforcement learning baselines.
- The same structure supports flexible operations for complex multi-capacity package picking tasks.
Where Pith is reading between the lines
- Similar bi-level priority adjustments could be tested in other multi-vehicle logistics settings such as port container handling or hospital supply transport.
- Integration with live sensor data from warehouse floors might further improve the routing level beyond the current simulation results.
- If the cost reductions hold, operators could meet rising order volumes with fewer vehicles than current planning methods require.
Load-bearing premise
The simulation experiments across diverse warehouse layouts and order demand patterns accurately represent real-world conditions, and the heuristics and RL training generalize beyond the tested scenarios.
What would settle it
Running the framework on physical AGVs in an operational smart warehouse during an actual peak-demand period and checking whether average order delays and total costs drop by more than 50 percent while the service level stays above 90 percent.
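The check described above can be written down as a short sketch. It assumes that service level means the fraction of orders whose delay stays within a tolerance, which this summary does not confirm. The function names and the `tolerance` parameter are hypothetical.

```python
from statistics import mean


def relative_reduction(baseline, treated):
    """Fractional reduction of `treated` relative to `baseline`."""
    return (baseline - treated) / baseline


def service_level(delays, tolerance=0.0):
    """Assumed definition: share of orders with delay <= tolerance."""
    on_time = sum(1 for d in delays if d <= tolerance)
    return on_time / len(delays)


def claim_holds(baseline_delays, framework_delays,
                baseline_cost, framework_cost, tolerance=0.0):
    """True if delay and cost both drop by more than 50 percent
    and the framework's service level stays above 90 percent."""
    delay_cut = relative_reduction(mean(baseline_delays), mean(framework_delays))
    cost_cut = relative_reduction(baseline_cost, framework_cost)
    level = service_level(framework_delays, tolerance)
    return delay_cut > 0.5 and cost_cut > 0.5 and level > 0.9
```

A field trial would feed measured per-order delays and total costs from the baseline policy and from the framework into `claim_holds`. All three conditions must hold at once for the claim to pass.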
Original abstract
This paper proposes a bi-level optimization framework to coordinate Automated Guided Vehicle (AGV) flexible operations in smart independent warehouses, addressing the critical challenge of balancing high-throughput order fulfillment with stringent cost control. The framework is designed to simultaneously optimize flexible customer service level, system cost, and operational efficiency. The first level dynamically adjusts real-time scheduling parameters, such as order commitment times and delay tolerance, based on predefined customer priority categories. The second level performs real-time routing optimization for each AGV by identifying the shortest feasible paths while avoiding conflicts. For complex multi-capacity package picking tasks, two heuristic rules, priority, deadline, with shortest path (PDSP) and delay cost with shortest path (DCSP), are applied to multi-capacity package picking tasks and further training is carried out using the reinforcement learning algorithm of A* guided deep Q-learning (AGDQN). Comprehensive simulation experiments, conducted across diverse warehouse layouts and order demand patterns, demonstrate that the proposed framework equipped with both heuristic rules consistently reduces average order delay and total system costs by over 50% during peak demand periods. This is achieved while maintaining a service level above 90% and maximizing AGV utilization. The method also exhibits superior flexibility and sustained efficiency under normal and fluctuating demand scenarios. Additional ablation studies confirm that the proposed priority sorting mechanism delivers robust performance advantages when tested with various other reinforcement learning baselines.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a bi-level optimization framework for coordinating flexible AGV operations in smart warehouses. The first level dynamically adjusts scheduling parameters (e.g., order commitment times and delay tolerance) according to customer priority categories. The second level performs real-time routing optimization using shortest feasible paths with conflict avoidance; for multi-capacity picking, it applies the PDSP and DCSP heuristics and trains them via A* guided deep Q-learning (AGDQN). Comprehensive simulations across diverse layouts and demand patterns are reported to show that the framework with both heuristics consistently reduces average order delay and total system costs by over 50% in peak periods while keeping service level above 90% and maximizing AGV utilization.
Significance. If the reported simulation improvements prove robust, the bi-level priority-sorting approach combined with guided RL offers a practical method for balancing service level, cost, and throughput in warehouse AGV scheduling. The explicit separation of priority-based parameter tuning from conflict-aware routing, together with the two heuristics and AGDQN component, provides a concrete, implementable architecture that addresses a real operational problem. The manuscript does not, however, supply the detailed experimental protocol needed to assess whether these gains are statistically reliable or sensitive to modeling assumptions.
major comments (2)
- [Abstract / simulation experiments] The central claim that the framework 'consistently reduces average order delay and total system costs by over 50% during peak demand periods' is presented without any information on the number of independent replications, variance estimates, confidence intervals, statistical tests, or the precise baseline algorithms against which the 50% figure is measured. This omission directly undermines evaluation of the performance assertions.
- [simulation experiments] The warehouse simulator (AGV dynamics, multi-capacity picking times, conflict avoidance rules, and synthetic demand generation) is described only at a high level. No sensitivity analysis is reported for key parameters such as AGV speed variance, order-size distribution, or layout regularity, leaving open the possibility that the quoted improvements are tied to the specific modeling choices rather than the framework itself.
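The reporting requested in the first major comment could look like the following sketch, which uses only the Python standard library. The function names are hypothetical. The critical value `t_crit` is passed in by the caller, since the standard library has no t-distribution quantile function (for example, 2.776 is the two-sided 95% value for 4 degrees of freedom).

```python
from math import sqrt
from statistics import mean, stdev


def summarize(samples, t_crit):
    """Mean, sample standard deviation, and a t-based confidence interval.

    `t_crit` is the two-sided critical value for len(samples) - 1
    degrees of freedom, supplied by the caller.
    """
    m = mean(samples)
    s = stdev(samples)
    half_width = t_crit * s / sqrt(len(samples))
    return m, s, (m - half_width, m + half_width)


def paired_t(baseline, framework):
    """Paired t statistic for per-replication differences (baseline - framework).

    Assumes both lists come from the same replications (same seeds,
    same demand traces), in the same order.
    """
    diffs = [b - f for b, f in zip(baseline, framework)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))
```

Pairing replications by shared random seed is a design choice: it removes between-replication variance from the comparison, which usually tightens the interval on the difference. The statistic would then be compared against the same critical value used for the interval.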
minor comments (2)
- [Abstract] The abstract would be clearer if it briefly indicated the range of warehouse layouts (grid sizes, aisle configurations) and demand patterns (Poisson rates, peak-to-normal ratios) used in the experiments.
- [framework description] Notation for the bi-level formulation and the exact definitions of PDSP and DCSP decision rules could be presented with pseudocode or a compact algorithmic box to improve reproducibility.
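As an illustration of the kind of algorithmic box the last comment asks for, here is one plausible reading of the two rule names. It is not the paper's definition: the exact PDSP and DCSP decision rules are not given in this summary, so the sort keys, the `Task` fields, and the lateness model below are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Task:
    task_id: str
    priority: int           # assumed: lower number = more urgent
    deadline: float         # assumed: minutes on the same clock as `now`
    path_length: float      # shortest-path length to the pickup point
    delay_cost_rate: float  # assumed: cost per minute of lateness


def pdsp_order(tasks):
    """Assumed PDSP reading: sort by priority, then deadline, then path length."""
    return sorted(tasks, key=lambda t: (t.priority, t.deadline, t.path_length))


def dcsp_order(tasks, now=0.0, speed=1.0):
    """Assumed DCSP reading: serve the task with the highest projected
    delay cost first, breaking ties by shorter path."""

    def projected_cost(t):
        lateness = max(0.0, now + t.path_length / speed - t.deadline)
        return t.delay_cost_rate * lateness

    return sorted(tasks, key=lambda t: (-projected_cost(t), t.path_length))
```

Under this reading PDSP is a purely lexicographic rule, while DCSP depends on the current time and AGV speed, so its ordering can change as deadlines approach.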
Simulated Author's Rebuttal
We thank the referee for the constructive and insightful comments on our manuscript. We agree that greater transparency in the experimental protocol and additional robustness checks will strengthen the presentation of our results. Below we provide point-by-point responses and indicate the revisions we will make.
- Referee: [Abstract / simulation experiments] the central claim that the framework 'consistently reduces average order delay and total system costs by over 50% during peak demand periods' is presented without any information on the number of independent replications, variance estimates, confidence intervals, statistical tests, or the precise baseline algorithms against which the 50% figure is measured.
  Authors: We acknowledge that the manuscript currently omits explicit reporting of replication counts, variance measures, confidence intervals, and formal statistical tests for the performance claims. The 50% reduction figures were obtained from comparisons against standard warehouse scheduling policies (FCFS, SPT) and several RL baselines already referenced in the ablation studies, but these details were not quantified in the text. In the revised manuscript we will add a new subsection to the simulation-experiments section that (i) states the number of independent replications performed, (ii) reports means, standard deviations, and 95% confidence intervals, (iii) includes the results of paired t-tests or Wilcoxon tests for significance, and (iv) explicitly names and briefly describes each baseline algorithm used to compute the 50% improvement. Corresponding updates will also be made to the abstract to ensure consistency.
  Revision: yes
- Referee: [simulation experiments] the warehouse simulator (AGV dynamics, multi-capacity picking times, conflict avoidance rules, and synthetic demand generation) is described only at a high level. No sensitivity analysis is reported for key parameters such as AGV speed variance, order-size distribution, or layout regularity.
  Authors: The current description of the simulator is indeed high-level. We will expand the simulation-experiments section with a more detailed account of AGV kinematic models, multi-capacity picking time distributions, conflict-resolution rules, and the stochastic demand generator. In addition, we will conduct and report sensitivity analyses on AGV speed variance, order-size distributions, and multiple layout regularities (grid, aisle, and irregular topologies). These results will be presented in a new table or figure set to demonstrate that the reported improvements remain consistent across the tested parameter ranges.
  Revision: yes
Circularity Check
No significant circularity; framework and results are independently defined and externally validated
The paper defines its bi-level optimization framework, PDSP/DCSP heuristics, and AGDQN training procedure as independent constructs. The central performance claims (over 50% reductions in delay and costs with >90% service level) are presented as outcomes of comprehensive simulation experiments across varied layouts and demand patterns, not as mathematical derivations that reduce to the framework's own inputs or fitted parameters. No equations, self-citations, or uniqueness theorems are invoked in a load-bearing way that would make the results tautological by construction. The derivation chain remains self-contained, with simulations serving as external benchmarks rather than internal redefinitions.