Day | Topics | Slides | Readings | |
---|---|---|---|---|
Sept 15 | (Admin) Introduction | (pdf); (4pp) | No reading. | |
Sept 22 | Overview Planning; Planning and Search (Part 1) | (pdf); (4pp) (pdf); (4pp) | Read "Everything You Always Wanted to Know about Planning (But Were Afraid to Ask)", Joerg Hoffmann | |
Sept 29 | Planning and Search (Part 2); Intro to Markov Decision Processes; (Admin) Assignment Overview | (pdf); (4pp) (pdf) (pdf); (4pp) | (1) Chapter 3 on Finite Markov Decision Processes, Reinforcement Learning: An Introduction, Sutton & Barto (Draft 2nd ed); (2) Using Alternative Suboptimality Bounds in Heuristic Search, Valenzano et al. | |
Oct 6 | Planning as Dynamic Programming; Bandits; Rick's Bandits Program (Try me!) | (pdf) (pdf); (4pp) (py) | Chapter 4 on Dynamic Programming, Reinforcement Learning: An Introduction, Sutton & Barto (Draft 2nd ed) | |
Oct 13 | Bandits & Monte Carlo Methods; (Admin) Class Project | (pdf); (4pp) (pdf); (4pp) | Chapter 5 on Monte Carlo Methods, Reinforcement Learning: An Introduction, Sutton & Barto (Draft 2nd ed); (Optional) Chapters 1 and 2 on The Reinforcement Learning Problem and Multi-Arm Bandits. | |
Oct 20 | Temporal Difference Learning; Eligibility Traces | (pdf); (4pp) (pdf); (4pp) | Chapter 6 on Temporal-Difference Learning, Reinforcement Learning: An Introduction, Sutton & Barto (Draft 2nd ed); (Optional) Chapter 7 on Eligibility Traces. | |
Oct 27 | Papers and Projects; (Admin) Software Resources & Demos: Planning (Eldan), Reinforcement Learning (David), Arcade Learning Environment (Juliana) | (pdf); (4pp) (presentation); (info sheet) (pdf); (4pp) (presentation); (info sheet) | 3 papers for next week (see individual paper assignments and questions) | |
Nov 3 | Paper Presentations: P2 - NRPA (Hengwei); P4 - RTA* (Van); P13 - PRP (Wuga) | (pdf) (pdf) (pdf) | (1) Please skim Chapter 9 of Sutton & Barto 2nd edition in preparation for next week's lecture. (2) 1 paper for next week (see individual paper assignments and questions) | |
Nov 10 | Paper Presentations: P8 - LAO* (Jonathan); Value Function Approximation | (pdf) (pdf) | 3 papers for next week (see individual paper assignments and questions) | |
Nov 17 | Paper Presentations: P9 - MC Tree Search & Go (Yang Qiao); P10 - MC and Finite MDP (Ryan); P24 - Policy Gradient & Actor-Critic Methods (Jon) | (pdf) (pdf) (pdf); (supplemental pdf) | 3 papers for next week (see individual paper assignments and questions) | |
Nov 24 | Paper Presentations: P18 - Reward Shaping (Toryn); P17 - Intrinsic Motivation (Amna); P20 - Apprenticeship Learning (Kathy) | (pdf) (pdf) (pdf) | 3 papers for next week (see individual paper assignments and questions) | |
Dec 1 | Paper Presentations: P21 - Deep RL (Bo Wen); P22 - Atari w/ Shallow RL (Rodrigo); P23 - AlphaGo (Shun) | (pdf) (pdf) (pdf) | For next week -- Presentations | |