Design of a turret stabilization system using reinforcement learning with external disturbance compensation

DOI:

https://doi.org/10.37868/sei.v7i2.id635

Abstract

This study considers reinforcement learning (RL), in particular Proximal Policy Optimization (PPO) and Twin Delayed Deep Deterministic Policy Gradient (TD3), for stabilizing a weapon turret system subject to a set of external disturbances. Traditional control algorithms such as proportional-integral-derivative (PID) controllers are limited in dynamic settings because they require manual tuning and lack a robust response to disturbances. This research compares RL-based controllers with PID in a simulated scenario in which the turret experiences disturbances related to recoil, vibrations, and angular oscillations. A kinematic model of the turret was designed, and Lagrangian mechanics was used to model the disturbances acting on it. The simulation was run in PyBullet, and performance was evaluated via mean absolute error (MAE) and root mean square error (RMSE). The findings show that the RL controllers, PPO and TD3, outperformed the PID controller in terms of faster stabilization, reduced errors, and compensation of disturbances. The RL agents adapted independently to different patterns of disturbances in a noisy and dynamic environment and performed better than conventional systems. The results indicate that RL-based control systems can be used in real-world applications, especially where accuracy is necessary.
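
As an illustration of the two evaluation metrics named in the abstract, the following minimal sketch computes MAE and RMSE over a sequence of angular tracking errors. The function names and the sample error values are hypothetical and are not taken from the paper; the formulas are the standard definitions of the two metrics.

```python
import math

def mae(errors):
    """Mean absolute error: average magnitude of the tracking error."""
    return sum(abs(e) for e in errors) / len(errors)

def rmse(errors):
    """Root mean square error: penalizes large deviations more heavily than MAE."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

# Hypothetical per-step errors (target angle minus measured angle, in degrees).
# These numbers are made up for illustration only.
errors = [0.5, -0.5, 1.0, -1.0]

print("MAE: ", mae(errors))
print("RMSE:", rmse(errors))
```

Because RMSE squares each error before averaging, it is never smaller than MAE on the same data, so a large gap between the two would point to occasional large deviations, for example during a recoil event.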

Published

2025-12-03

How to Cite

[1]
A. Dolya, B. Kolumbetov, A. Tulembayev, A. Buldeshov, and S. Nessipova, “Design of a turret stabilization system using reinforcement learning with external disturbance compensation”, Sustainable Engineering and Innovation, vol. 7, no. 2, pp. 619-630, Dec. 2025.

Section

Articles