irl

Dynamics-Aware Comparison of Learned Reward Functions

We propose a method for quantifying the similarity of learned reward functions without performing policy learning and evaluation.