Proximal Policy Optimization for Continuous Control.

DeepRL2.2A, CS-456 Artificial Neural Networks

Proximal Policy Optimization (PPO) algorithms take a gradient step that is as large as possible while keeping the updated policy close to the previous one. This video explains what that means.
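As a rough illustration of how the step size can be limited, here is a minimal sketch of the clipped surrogate objective used in the PPO-Clip variant. The description above does not say which variant the video covers, so this choice is an assumption. The function name `ppo_clip_objective` and the default `eps=0.2` are also illustrative and not taken from the lecture.

```python
import math

def ppo_clip_objective(logp_new, logp_old, advantages, eps=0.2):
    """Clipped surrogate objective (to be maximized), averaged over samples.

    logp_new, logp_old: log-probabilities of the taken actions under the
    updated and the previous policy; advantages: advantage estimates.
    All names and the default eps are assumptions made for this sketch.
    """
    total = 0.0
    for ln, lo, adv in zip(logp_new, logp_old, advantages):
        # Probability ratio between the updated and the previous policy.
        ratio = math.exp(ln - lo)
        # Restrict the ratio to the interval [1 - eps, 1 + eps].
        clipped_ratio = min(max(ratio, 1.0 - eps), 1.0 + eps)
        # Pessimistic choice: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)

# Example: the updated policy doubles the probability of an action
# that has a positive advantage.
value = ppo_clip_objective([math.log(2.0)], [0.0], [1.0])
```

With a positive advantage, the objective stops growing once the ratio exceeds 1 + eps, so the gradient gives no incentive to move the policy further in that direction. In this sense the update is as large as the clipping range allows and no larger.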