2, RL 02: Multiarmed Bandits

Reinforcement Learning

15 March 2023, Thomas Koller, 94 views

Multiarmed bandits. Balancing exploration and exploitation. Epsilon-greedy action selection. Updating the value function.
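
The topics listed above can be illustrated with a minimal sketch. It is not taken from the lecture itself; it assumes the standard sample-average formulation, in which the estimate for the chosen arm is updated as Q(a) <- Q(a) + (r - Q(a)) / N(a). The function names (`epsilon_greedy`, `update_value`, `run_bandit`), the Gaussian reward model, and all parameter values are hypothetical choices made for illustration.

```python
import random


def epsilon_greedy(q_values, epsilon, rng):
    """Pick a random arm with probability epsilon, otherwise a greedy arm."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore
    best = max(q_values)
    # Exploit; break ties between equally good arms at random.
    return rng.choice([a for a, q in enumerate(q_values) if q == best])


def update_value(q_values, counts, action, reward):
    """Incremental sample-average update of the chosen arm's estimate."""
    counts[action] += 1
    q_values[action] += (reward - q_values[action]) / counts[action]


def run_bandit(true_means, steps=2000, epsilon=0.1, seed=0):
    """Simulate a bandit with unit-variance Gaussian rewards (an assumption)."""
    rng = random.Random(seed)
    q_values = [0.0] * len(true_means)
    counts = [0] * len(true_means)
    for _ in range(steps):
        action = epsilon_greedy(q_values, epsilon, rng)
        reward = rng.gauss(true_means[action], 1.0)
        update_value(q_values, counts, action, reward)
    return q_values, counts


if __name__ == "__main__":
    estimates, pulls = run_bandit([0.1, 0.5, 0.9])
    print("estimated values:", [round(q, 2) for q in estimates])
    print("pull counts:", pulls)
```

With epsilon set to 0 the agent never explores and can lock onto a suboptimal arm; with epsilon set to 1 it ignores its estimates entirely. Small values such as 0.1 are a common compromise between the two.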

Viewable by everyone. All rights reserved.