Supervised by: Chinese Academy of Sciences
Sponsored by: Chinese Society of Optimization, Overall Planning and Economic Mathematics
              Institutes of Science and Development, Chinese Academy of Sciences


Intelligent redeployment of electric ambulances considering uncertain demand time and position

LIU Jia, CAO Jie, XIONG Yi

  1. Zhongnan University of Economics and Law, Wuhan 430073, China
  • Received: 2024-11-13  Revised: 2025-09-05  Accepted: 2026-01-01
  • Corresponding author: LIU Jia
  • Supported by:
    General Program of the National Natural Science Foundation of China (72071212)


Abstract: Ambulances are a scarce resource in emergency medical service (EMS) systems, and their rational deployment is essential to improving EMS quality. This study addresses the intelligent redeployment of electric ambulances: how to dynamically reselect a deployment station after an emergency task is completed, in response to uncertainty in the call-request arrival rate (λ). Here, "intelligent" refers to using deep reinforcement learning (DRL) to automatically optimize redeployment decisions under environmental uncertainty and dynamic change. Existing mainstream redeployment methods typically rely on precise estimates of λ, and they perform poorly in highly uncertain environments, especially when λ cannot be predicted accurately. To address this challenge, this paper proposes a dynamic decision-making method for the intelligent redeployment of electric ambulances that accounts for uncertainty in state observations. First, the EMS system is modeled as a Markov decision process, and approximate dynamic programming is used to compress the system's state variables and mitigate the curse of dimensionality. Second, uncertainty is incorporated into the observation of the state variables, a new policy-gradient formula is derived, and a robust Actor-Critic algorithm is designed to learn the optimal redeployment policy. Experimental results show that the robust Actor-Critic algorithm significantly outperforms existing methods in both average response time and on-time response rate. Building on these results, the paper examines how different charging strategies affect ambulance performance and offers management recommendations for electric-ambulance redeployment, providing theoretical support and practical guidance for optimizing EMS.
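To make the second step concrete, the following is a minimal, self-contained sketch of an Actor-Critic update in which Gaussian noise is injected into the state observations — the kind of observation uncertainty the abstract describes. Everything here is hypothetical and illustrative: the toy station/demand setup, the reward, the linear actor and critic, and all parameter names are assumptions for illustration, not the paper's actual model or algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

N_STATIONS = 4        # hypothetical number of redeployment stations
OBS_NOISE = 0.1       # std of Gaussian noise added to observations (uncertainty)
GAMMA = 0.95          # discount factor

# Hidden (unobserved) call-arrival profile per station; purely illustrative
true_demand = np.array([0.1, 0.5, 0.2, 0.2])

def observe(state_vec):
    """Return a noisy observation of the (compressed) system state."""
    return state_vec + rng.normal(0.0, OBS_NOISE, size=state_vec.shape)

def softmax(z):
    z = z - z.max()           # numerical stability
    e = np.exp(z)
    return e / e.sum()

def step(station):
    """Toy reward: 1 if a call arrives near the chosen station."""
    return float(rng.random() < true_demand[station])

# Linear actor (policy over stations) and linear critic (state value)
theta = np.zeros((N_STATIONS, N_STATIONS))   # actor weights
w = np.zeros(N_STATIONS)                     # critic weights
alpha_actor, alpha_critic = 0.05, 0.1

state = np.eye(N_STATIONS)[0]                # start at station 0 (one-hot)
for _ in range(2000):
    obs = observe(state)
    probs = softmax(theta @ obs)             # policy acts on the NOISY obs
    a = rng.choice(N_STATIONS, p=probs)      # sample a redeployment station
    r = step(a)
    next_state = np.eye(N_STATIONS)[a]
    next_obs = observe(next_state)

    # TD error computed from noisy observations
    td = r + GAMMA * (w @ next_obs) - (w @ obs)
    w += alpha_critic * td * obs

    # Policy-gradient step: grad log pi(a|obs) for a softmax-linear actor
    grad_log = (np.eye(N_STATIONS)[a] - probs)[:, None] * obs[None, :]
    theta += alpha_actor * td * grad_log
    state = next_state

final_probs = softmax(theta @ np.eye(N_STATIONS)[0])
```

The key design point mirrored from the abstract is that both the actor and the critic only ever see `observe(state)`, never the true state, so the learned policy must be robust to observation noise rather than assuming an accurate estimate of the arrival rate.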

Key words: deep reinforcement learning, electric ambulance, robust Actor-Critic algorithm, redeployment, uncertain environments