Review why Q-learning could fail. #24

SBoulanger · 2020-07-16T01:09:01Z

For more information about how and why Q-learning methods can fail, see 1) this classic paper by Tsitsiklis and Van Roy, 2) the (much more recent) review by Szepesvári (section 4.3.2), and 3) chapter 11 of Sutton and Barto, especially section 11.3 on “the deadly triad”: function approximation, bootstrapping, and off-policy data, which together cause instability in value-learning algorithms.
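To make the deadly triad concrete, here is a minimal sketch in the spirit of the two-state "w to 2w" divergence example discussed in chapter 11 of Sutton and Barto. It is only an illustration under stated assumptions, not code from any of the references: the function name `simulate_w_to_2w` and the default values of `gamma`, `alpha`, `steps`, and `w0` are my own choices. State A has feature 1 and state B has feature 2, so a single weight `w` gives value estimates `w` and `2w` (function approximation). The target uses the current estimate of B (bootstrapping), and only the A -> B transition is ever updated (off-policy data distribution).

```python
def simulate_w_to_2w(gamma=0.9, alpha=0.1, steps=50, w0=1.0):
    """Repeatedly apply a semi-gradient TD(0) update to one transition.

    Hypothetical setup: state A has feature 1 (estimate w), state B has
    feature 2 (estimate 2w), and the reward on A -> B is 0. The true
    values are 0, so the correct weight is w = 0.
    """
    w = w0
    history = [w]
    for _ in range(steps):
        # TD error: reward + gamma * V(B) - V(A), with V(A) = w and V(B) = 2w.
        td_error = 0.0 + gamma * (2.0 * w) - (1.0 * w)
        # Semi-gradient step; the gradient of V(A) with respect to w is 1.
        w += alpha * td_error * 1.0
        history.append(w)
    return history


if __name__ == "__main__":
    # Each step multiplies w by (1 + alpha * (2 * gamma - 1)), so the
    # weight grows whenever gamma > 0.5 and shrinks when gamma < 0.5.
    diverging = simulate_w_to_2w(gamma=0.9)
    converging = simulate_w_to_2w(gamma=0.4)
    print("gamma=0.9, final |w|:", abs(diverging[-1]))
    print("gamma=0.4, final |w|:", abs(converging[-1]))
```

Because B's estimate is never updated on its own transitions, nothing pulls `2w` back toward its true value, so for `gamma > 0.5` the weight keeps growing no matter how small the step size is. Removing any one of the three ingredients (for example, also updating on B's transitions in proportion to how often they occur on-policy) changes this picture.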

SBoulanger added this to the Learn more about Q-Learning Methods milestone Jul 16, 2020

SBoulanger self-assigned this Jul 16, 2020

SBoulanger added the learning label Jul 16, 2020

andrewjong modified the milestones: Delve deep into Policy Gradients, Explore MineRL Baselines, Delve into Deep Q-Learning and Inverse RL Jul 18, 2020
