Genetic Algorithm as Function Optimizer in Reinforcement Learning and Sensor Odometry.
Authors
Sehgal, Adarsh
Issue Date
2019
Type
Thesis
Keywords
Deep Reinforcement Learning, Evolutionary Computing, Genetic Algorithm, LIMO, Optimization, Sensor Odometry
Abstract
Reinforcement learning (RL) enables agents to make decisions based on a reward function. However, the values chosen for the learning algorithm's parameters can significantly affect the overall learning process. In this thesis, we use a genetic algorithm (GA) to find the values of the parameters used in Deep Deterministic Policy Gradient (DDPG) combined with Hindsight Experience Replay (HER), in order to speed up the agent's learning. We apply this method to the fetch-reach, slide, push, pick-and-place, and door-opening robotic manipulation tasks. Our experimental evaluation shows that the method leads to significantly better performance, reached faster, than with the original algorithm.

This thesis also addresses Lidar-Monocular Visual Odometry (LIMO), an odometry estimation algorithm that combines a camera and a Light Detection and Ranging (LIDAR) sensor for visual localization. LIMO tracks camera features as well as features from LIDAR measurements, and estimates the motion of the sensors using bundle adjustment based on robust keyframes. To reject outliers, LIMO uses semantic labelling and weighting of vegetation landmarks. A drawback of LIMO, shared by many other odometry estimation algorithms, is that it has many parameters that must be manually adjusted in response to dynamic changes in the environment in order to reduce translational error. We therefore also present and argue for the use of genetic algorithms to optimize LIMO's parameters and to maximize its localization and motion estimation performance. We evaluate our approach on the well-known KITTI odometry dataset and show that the genetic algorithm helps LIMO significantly reduce translation error on different datasets.
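
The general idea of using a GA as a parameter optimizer can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the parameter names and ranges in `BOUNDS`, the GA settings, and the `toy_fitness` function are all assumptions made for illustration. In practice the fitness would come from an expensive evaluation, such as a DDPG + HER training run or a LIMO run scored by translation error, rather than from a closed-form function.

```python
import random

# Hypothetical search ranges, chosen only for illustration; the actual
# parameters and bounds tuned in the thesis are not listed in the abstract.
BOUNDS = {
    "learning_rate": (1e-4, 1e-1),
    "discount": (0.80, 0.999),
    "polyak": (0.50, 0.999),
}


def toy_fitness(params):
    """Cheap stand-in for an expensive evaluation; higher is better.

    The target values below are arbitrary and carry no experimental meaning.
    """
    target = {"learning_rate": 0.001, "discount": 0.98, "polyak": 0.95}
    return -sum(
        ((params[k] - target[k]) / (BOUNDS[k][1] - BOUNDS[k][0])) ** 2
        for k in BOUNDS
    )


def random_individual(rng):
    """Sample one candidate parameter set uniformly within the bounds."""
    return {k: rng.uniform(lo, hi) for k, (lo, hi) in BOUNDS.items()}


def crossover(a, b, rng):
    """Uniform crossover: each parameter is copied from one of the parents."""
    return {k: (a[k] if rng.random() < 0.5 else b[k]) for k in BOUNDS}


def mutate(ind, rng, rate=0.2, scale=0.1):
    """Add Gaussian noise to some parameters, clipped to the bounds."""
    out = dict(ind)
    for k, (lo, hi) in BOUNDS.items():
        if rng.random() < rate:
            noisy = out[k] + rng.gauss(0.0, scale * (hi - lo))
            out[k] = min(hi, max(lo, noisy))
    return out


def tournament(pop, scores, rng, k=3):
    """Return the fittest of k randomly chosen individuals."""
    contenders = rng.sample(range(len(pop)), k)
    return pop[max(contenders, key=lambda i: scores[i])]


def run_ga(fitness, generations=30, pop_size=20, seed=0):
    """Evolve parameter sets; return the best one and the best score per generation."""
    rng = random.Random(seed)
    pop = [random_individual(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        scores = [fitness(ind) for ind in pop]
        best_idx = max(range(pop_size), key=lambda i: scores[i])
        history.append(scores[best_idx])
        children = [pop[best_idx]]  # elitism: keep the current best unchanged
        while len(children) < pop_size:
            parent_a = tournament(pop, scores, rng)
            parent_b = tournament(pop, scores, rng)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        pop = children
    scores = [fitness(ind) for ind in pop]
    best_idx = max(range(pop_size), key=lambda i: scores[i])
    history.append(scores[best_idx])
    return pop[best_idx], history


if __name__ == "__main__":
    best, history = run_ga(toy_fitness)
    print("best parameters:", best)
    print("best fitness:", history[-1])
```

Because each fitness evaluation would normally be a full training or odometry run, the population size and number of generations largely determine the total cost. The small values used here are placeholders, not recommendations.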