Course outline

Week -1: Course outline

Slides (NL)
Getting Python / conda virtual environment up and running

Week 0: Programming with Python

Datacamp course: Introduction to Python (including numpy)

Object oriented programming a.k.a. Classes

Follow this tutorial:

Python Classes

Extra info on Inheritance:

Python Inheritance
Exercise: Classes & Inheritance
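As a warm-up for the Classes & Inheritance exercise, here is a minimal sketch of a base class and a subclass that overrides one method. The names `Agent`, `RandomAgent` and `act` are hypothetical and chosen only for illustration; they are not taken from the course material.

```python
import random


class Agent:
    """Base class: stores a name and always picks the first available action."""

    def __init__(self, name):
        self.name = name

    def act(self, actions):
        # Default behaviour: take the first action in the list.
        return actions[0]

    def describe(self):
        return f"{self.name} ({type(self).__name__})"


class RandomAgent(Agent):
    """Subclass: inherits describe() and overrides act() with a random choice."""

    def __init__(self, name, seed=0):
        super().__init__(name)          # reuse the parent's initialisation
        self.rng = random.Random(seed)  # private random generator

    def act(self, actions):
        return self.rng.choice(actions)


first = Agent("first")
rand = RandomAgent("rand")
print(first.describe(), "->", first.act(["left", "right"]))
print(rand.describe(), "->", rand.act(["left", "right"]))
```

`RandomAgent` reuses `describe()` unchanged and only replaces `act()`, which is the usual reason to reach for inheritance.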

Week 1: Introduction to RL

Read first chapter "Introduction" of Sutton & Barto
Datacamp tutorial Python Modules
Exercise: Tic-tac-toe
Optional: Watch Lecture 1 of David Silver (1.5 hours)

Week 2: Multi-armed bandits

Bandits are MDPs with just one state. Example: pick an advertisement to show; the reward is received when it is clicked. Example: pick a market; the reward is the number of units sold in that market.

Read second chapter "Multi-armed Bandits" of Sutton & Barto
Exercise: work through the OpenAI Gym tutorial
Exercise: Bandits_in_gym. Here we code up the simple bandit algorithm from p. 32 of Sutton & Barto, as well as the UCB variant.
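To give an idea of what the bandit exercise involves, the sketch below implements an epsilon-greedy bandit with incremental sample-average updates, plus an optional UCB action rule. It is not the course notebook: it does not use Gym, the function name `run_bandit` and its parameters are hypothetical, and rewards are assumed to be Gaussian with unit variance around each arm's true mean.

```python
import math
import random


def run_bandit(true_means, steps=2000, epsilon=0.1, ucb_c=None, seed=0):
    """Run a k-armed bandit and return (value estimates, pull counts).

    If ucb_c is None, actions are chosen epsilon-greedily; otherwise the
    UCB rule with exploration constant ucb_c is used.
    """
    rng = random.Random(seed)
    k = len(true_means)
    q = [0.0] * k  # estimated value of each arm
    n = [0] * k    # number of times each arm was pulled

    for t in range(1, steps + 1):
        if ucb_c is None:
            # Epsilon-greedy: explore with probability epsilon, else exploit.
            if rng.random() < epsilon:
                a = rng.randrange(k)
            else:
                a = max(range(k), key=lambda i: q[i])
        else:
            # UCB: try every arm once, then add an uncertainty bonus.
            untried = [i for i in range(k) if n[i] == 0]
            if untried:
                a = untried[0]
            else:
                a = max(
                    range(k),
                    key=lambda i: q[i] + ucb_c * math.sqrt(math.log(t) / n[i]),
                )

        # Assumption: Gaussian reward with unit variance.
        reward = rng.gauss(true_means[a], 1.0)

        # Incremental sample-average update of the chosen arm.
        n[a] += 1
        q[a] += (reward - q[a]) / n[a]

    return q, n


q_eps, n_eps = run_bandit([0.0, 0.5, 2.0])
q_ucb, n_ucb = run_bandit([0.0, 0.5, 2.0], ucb_c=2.0)
print("epsilon-greedy pulls:", n_eps)
print("UCB pulls:", n_ucb)
```

With well-separated arm means, both variants should end up pulling the best arm most often; UCB typically spends fewer pulls on clearly inferior arms.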

Week 3: Theory: Markov Decision Processes (MDPs)

Read third chapter of Sutton & Barto
Optional: Watch Lecture 2 of David Silver
Selected Book Exercises Ch 3

Week 4: Dynamic Programming (DP)

Read fourth chapter of Sutton & Barto
Watch Lecture 3 of David Silver
Exercise: Udacity Notebook for solving FrozenLake using Dynamic Programming.
Optional: Apply DP functions to JacksCarRental Gym environment
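As a preview of the dynamic programming exercises, here is a minimal sketch of value iteration on a tiny deterministic corridor. The corridor and the helper names (`make_corridor`, `q_value`, `value_iteration`) are hypothetical. The transition model is assumed to be stored as `P[s][a] = [(prob, next_state, reward, done), ...]`, which is the layout the Gym toy-text environments such as FrozenLake use for their `P` attribute.

```python
def make_corridor(n_states=4):
    """Hypothetical MDP: action 0 moves left, action 1 moves right.

    Reaching the last state ends the episode with reward 1.
    """
    goal = n_states - 1
    P = {}
    for s in range(n_states):
        if s == goal:
            # Terminal state: every action stays put with zero reward.
            P[s] = {a: [(1.0, s, 0.0, True)] for a in (0, 1)}
            continue
        left, right = max(s - 1, 0), s + 1
        P[s] = {
            0: [(1.0, left, 0.0, False)],
            1: [(1.0, right, 1.0 if right == goal else 0.0, right == goal)],
        }
    return P


def q_value(P, V, s, a, gamma):
    """Expected return of taking action a in state s, then following V."""
    return sum(
        p * (r + (0.0 if done else gamma * V[s2]))
        for p, s2, r, done in P[s][a]
    )


def value_iteration(P, gamma=0.9, theta=1e-8):
    n_states, n_actions = len(P), len(P[0])
    V = [0.0] * n_states
    while True:
        delta = 0.0
        for s in range(n_states):
            # Bellman optimality backup for state s.
            best = max(q_value(P, V, s, a, gamma) for a in range(n_actions))
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:
            break
    # Greedy policy with respect to the converged values.
    policy = [
        max(range(n_actions), key=lambda a: q_value(P, V, s, a, gamma))
        for s in range(n_states)
    ]
    return V, policy


V, policy = value_iteration(make_corridor())
print("values:", V)
print("policy:", policy)
```

The values decay by a factor `gamma` for each step away from the goal, and the greedy policy moves right in every non-terminal state.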

Week 5: Monte Carlo (MC) control

Read selected paragraphs from Chapter 5
Exercise: Udacity Notebook for solving the Blackjack environment using MC control.

Week 6: Q-learning

Read selected paragraphs from Chapter 6
Exercise: Udacity Notebook on temporal difference (TD) methods (CliffWalking environment).
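The core of tabular Q-learning fits in a few lines. The sketch below applies it to a small hypothetical corridor rather than CliffWalking, so that it runs without Gym. The `step` function, the function name `q_learning`, and the hyperparameter values are all assumptions made for illustration.

```python
import random


def step(state, action, n_states=5):
    """Hypothetical corridor: action 0 moves left, action 1 moves right.

    Reaching the last state ends the episode with reward 1.
    """
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == n_states - 1
    return next_state, (1.0 if done else 0.0), done


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
               epsilon=0.1, seed=0):
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(n_states)]

    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Epsilon-greedy behaviour policy, breaking ties at random.
            if rng.random() < epsilon or Q[s][0] == Q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if Q[s][0] > Q[s][1] else 1

            s2, r, done = step(s, a, n_states)

            # Q-learning target: bootstrap from the best next action.
            target = r + (0.0 if done else gamma * max(Q[s2]))
            Q[s][a] += alpha * (target - Q[s][a])
            s = s2

    return Q


Q = q_learning()
for s, row in enumerate(Q):
    print(s, [round(v, 3) for v in row])
```

Because the target uses `max(Q[s2])` regardless of which action the agent takes next, the update is off-policy: the agent explores with epsilon-greedy behaviour but learns the values of the greedy policy.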

Week 7: Economic application of Q-learning: algorithmic pricing

Selected papers (ACM, Calvano et al.).
Presentation by Jan Svitak (ACM)

Week 8: Programming multi-agent RL using PettingZoo

Exercise 1: PettingZoo introductory tutorial using the Tic-Tac-Toe two player environment.
Exercise 2: PettingZoo Q-learning tutorial using the Tic-Tac-Toe two player environment.


gsverhoeven/gt_rl_course
