Inverse Reinforcement Learning (IRL) is a subcategory of robotic learning from demonstration. This paradigm allows operators to specify robot behavior by demonstrating it rather than by programming it explicitly.
We propose the use of Distance Minimization-Inverse Reinforcement Learning (DM-IRL) as a general-purpose IRL method. DM-IRL uses an expert judge to assign scores to demonstrations (trajectories) and removes all optimality requirements from the demonstrations. We show that DM-IRL can learn high-quality behavior from extremely suboptimal demonstrations sourced from multiple demonstrators with unknown transition functions.
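To make the idea of learning from scored trajectories concrete, the following is a minimal sketch and not the authors' implementation. It assumes a reward that is linear in per-step state features, a discount factor `gamma`, and a ridge-regularized least-squares fit that minimizes the distance between each judge score and the predicted discounted return. The function names, the feature representation, and the synthetic data are all hypothetical and chosen only for illustration.

```python
import numpy as np

def discounted_feature_sum(features, gamma=0.9):
    """Collapse one trajectory's (T, d) per-step features into a d-vector.

    Assumption: the judge's score approximates a discounted return.
    """
    discounts = gamma ** np.arange(features.shape[0])
    return discounts @ features

def fit_reward_weights(trajectories, scores, gamma=0.9, ridge=1e-6):
    """Fit linear reward weights w so that Phi @ w is close to the scores.

    Solves the ridge-regularized normal equations
    (Phi^T Phi + ridge * I) w = Phi^T s.
    """
    Phi = np.stack([discounted_feature_sum(f, gamma) for f in trajectories])
    d = Phi.shape[1]
    A = Phi.T @ Phi + ridge * np.eye(d)
    b = Phi.T @ np.asarray(scores, dtype=float)
    return np.linalg.solve(A, b)

# Synthetic illustration: random (far from optimal) trajectories of varying
# length, scored by a hypothetical judge whose hidden reward weights are true_w.
rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0, 0.5])
trajectories = [rng.normal(size=(int(rng.integers(5, 15)), 3)) for _ in range(50)]
scores = [discounted_feature_sum(f) @ true_w for f in trajectories]

w_hat = fit_reward_weights(trajectories, scores)
```

In this sketch no trajectory needs to be optimal, and no transition model is used: the fit relies only on the observed features and the scores. A real judge would give noisy scores, so the recovered weights would be approximate rather than exact.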