How do eligibility traces arise in policy gradient algorithms?

DeepRL1.4B, CS-456 Artificial Neural Networks

Optimizing the expected return with policy gradient methods in a multi-step environment naturally leads to eligibility traces. The key mathematical steps are sketched here.
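
As a rough numerical illustration of why traces appear, the sketch below checks one algebraic identity: weighting each step's log-policy gradient by the discounted return that follows it gives the same sum as weighting each reward by a decaying trace of past gradients. Everything here is an assumption made for illustration: the function names `forward_view` and `backward_view` are hypothetical, random vectors stand in for the gradients of log pi(a_t|s_t), and the return from step t is discounted relative to step t (no extra gamma^t factor). It is not the derivation used in the lecture.

```python
import random

def forward_view(grads, rewards, gamma):
    """Sum over t of G_t * g_t, with G_t = sum_{k>=t} gamma^(k-t) * r_k."""
    dim = len(grads[0])
    total = [0.0] * dim
    T = len(rewards)
    for t in range(T):
        # Discounted return collected from step t onward.
        G_t = sum(gamma ** (k - t) * rewards[k] for k in range(t, T))
        for i in range(dim):
            total[i] += G_t * grads[t][i]
    return total

def backward_view(grads, rewards, gamma):
    """Sum over k of r_k * z_k, with trace z_k = gamma * z_{k-1} + g_k."""
    dim = len(grads[0])
    total = [0.0] * dim
    z = [0.0] * dim  # eligibility trace, starts at zero
    for g, r in zip(grads, rewards):
        # Decay the old trace and add the current gradient.
        z = [gamma * z_i + g_i for z_i, g_i in zip(z, g)]
        # Each reward is credited to all earlier steps through the trace.
        total = [tot_i + r * z_i for tot_i, z_i in zip(total, z)]
    return total

# Stand-in data: random vectors replace the true log-policy gradients.
random.seed(0)
T, dim, gamma = 6, 3, 0.9
grads = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(T)]
rewards = [random.random() for _ in range(T)]

fwd = forward_view(grads, rewards, gamma)
bwd = backward_view(grads, rewards, gamma)
max_diff = max(abs(a - b) for a, b in zip(fwd, bwd))
print(max_diff)  # should be zero up to floating-point rounding
```

The trace form needs only the current reward and a running vector `z`, so the update can be applied online at every step instead of waiting for the end of the episode.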