Abstract
Artificial intelligence, in particular machine learning, is becoming increasingly important in automation and robotics, and machine learning approaches are also gaining acceptance in aviation. Reinforcement Learning in particular is attracting growing attention for navigation and control problems, for example in training flight manoeuvres. This paper investigates the use of Off-Policy Reinforcement Learning techniques for three-dimensional waypoint navigation of multicopters by providing roll, pitch and throttle commands. It describes and compares the training runs performed with two well-known Off-Policy algorithms, namely Deep Deterministic Policy Gradient (DDPG) and Soft Actor-Critic (SAC). Furthermore, we investigate the impact of the reward definition on the training outcome: for each algorithm, two agents are trained, each with a different reward definition. Finally, the paper presents the validation runs used to assess the four trained agents under different known and unknown conditions, and compares their performance with respect to the training algorithm and the reward definition used.
Original language | English |
---|---|
Title of host publication | 2022 International Conference on Unmanned Aircraft Systems (ICUAS) |
Pages | 1359-1366 |
Number of pages | 8 |
Publication status | Published - 2022 |
Event | International Conference on Unmanned Aircraft Systems (ICUAS) - Duration: 21 Jun 2022 → 24 Jun 2022 |
Conference
Conference | International Conference on Unmanned Aircraft Systems (ICUAS) |
---|---|
Period | 21/06/22 → 24/06/22 |
Research Field
- Assistive and Autonomous Systems
Keywords
- Reinforcement Learning
- Waypoint navigation