Reinforcement Learning Framework for Combinatorial Optimization Problems: Application to Dynamic Weapon Target Assignment

Description

This research presents a Reinforcement Learning (RL) framework for the Dynamic Weapon Target Assignment (DWTA) problem, a combinatorial optimization problem with military applications. The DWTA is an extension of the static Weapon Target Assignment (WTA) problem, incorporating time-dependent elements to model the dynamic nature of warfare. Traditional approaches to the WTA, including problem simplification, exact algorithms, and heuristic methods, face scalability and computational-complexity challenges. This research introduces a mathematical model for the DWTA that incorporates time stages, enabling weapon assignments to be planned strategically across them. The model is formulated as a nonlinear integer programming problem with constraints ensuring the feasibility of weapon assignments over time. To tackle the computational challenges of large-scale DWTA, the research employs Deep Reinforcement Learning (DRL) algorithms, specifically Deep Q-Network (DQN) and Actor-Critic (AC) methods, to learn efficient policies for weapon assignment. The proposed RL framework is evaluated on various problem instances, demonstrating that it provides viable solutions within reasonable inference time, making it suitable for time-sensitive applications. The results show that the RL approach outperforms an exact algorithm based on constraint programming, especially as the problem size increases, highlighting its potential for practical use on DWTA problems.
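
For context, a representative way to cast the DWTA as a nonlinear integer program (the notation below is illustrative and may differ from the paper's exact model) is to minimize the expected surviving value of the targets, where x_{ijt} = 1 if weapon i engages target j at stage t, V_j is the value of target j, p_{ijt} is the corresponding kill probability, and n_i is the ammunition of weapon i:

\min_{x}\; \sum_{j} V_j \prod_{t}\prod_{i} \bigl(1 - p_{ijt}\bigr)^{x_{ijt}}
\quad \text{s.t.} \quad
\sum_{j} x_{ijt} \le 1 \;\; \forall i,t, \qquad
\sum_{t}\sum_{j} x_{ijt} \le n_i \;\; \forall i, \qquad
x_{ijt} \in \{0,1\}.

The product of survival probabilities over weapons and stages is what makes the objective nonlinear, while the ammunition and per-stage engagement limits supply the feasibility constraints mentioned above.

To make the DRL side concrete, the minimal Python (PyTorch) sketch below shows one way a DQN-style value network could score per-stage assignment actions. The state encoding, network sizes, action space, and feasibility mask are assumptions chosen for illustration; this is not the implementation described in the paper.

# Minimal sketch (illustrative assumptions, not the paper's code): a DQN-style
# value network for one DWTA decision step. State = remaining ammunition,
# surviving-target values, and the current stage; action = one (weapon, target)
# pair or a no-op.
import torch
import torch.nn as nn

N_WEAPONS, N_TARGETS = 4, 5
STATE_DIM = N_WEAPONS + N_TARGETS + 1      # ammo counts, target values, stage index
N_ACTIONS = N_WEAPONS * N_TARGETS + 1      # every (weapon, target) pair plus "hold fire"

class QNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, N_ACTIONS),
        )

    def forward(self, state):
        return self.net(state)             # one Q-value per assignment action

def greedy_action(qnet, state, feasible_mask):
    # Mask out assignments that violate ammunition or engagement-window
    # constraints, then act greedily over the remaining Q-values.
    q = qnet(state)
    q = q.masked_fill(~feasible_mask, float("-inf"))
    return int(q.argmax())

# Example: all actions feasible at stage 0.
qnet = QNet()
state = torch.zeros(STATE_DIM)
mask = torch.ones(N_ACTIONS, dtype=torch.bool)
print(greedy_action(qnet, state, mask))

An Actor-Critic variant would replace the greedy argmax over Q-values with a stochastic policy head trained against a learned state-value baseline, but the state and action structure could remain the same.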