Policy Gradient vs. Action-value based methods
An empirical analysis of policy-gradient methods vs. action-value based methods for control.
Real-time learning with an off-policy RL control algorithm on a real robot (Create2).