CSC2542: Topics in Knowledge Representation and Planning and Reasoning about Action, Fall 2016

University of Toronto - Fall 2016
Department of Computer Science

CSC 2542: Topics in KR&R: Algorithms for Sequential Decision Making

CSC2542 - Time Table

Day	Topics	Slides	Readings
Sept 15	(Admin) Introduction	(pdf); (4pp)	No reading.
Sept 22	Overview Planning Planning and Search (Part 1)	(pdf); (4pp) (pdf); (4pp)	Read "Everything You Always Wanted to Know about Planning (But Were Afraid to Ask)", Joerg Hoffmann
Sept 29	Planning and Search (Part 2) Intro to Markov Decision Processes (Admin) Assignment Overview	(pdf); (4pp) (pdf) (pdf); (4pp)	(1) Chapter 3 on Finite Markov Decision Processes Reinforcement Learning: An Introduction Sutton & Barto, (Draft 2nd ed) (2) Using Alternative Suboptimality Bounds in Heuristic Search Valenzano et al.
Oct 6	Planning as Dynamic Programming Bandits Rick's Bandits Program (Try me!)	(pdf) (pdf); (4pp) (py)	Chapter 4 on Dynamic Programming Reinforcement Learning: An Introduction Sutton & Barto, (Draft 2nd ed)
Oct 13	Bandits & Monte Carlo Methods (Admin) Class Project	(pdf); (4pp) (pdf); (4pp)	Chapter 5 on Monte Carlo Methods Reinforcement Learning: An Introduction Sutton & Barto, (Draft 2nd ed) (Optional) Chapters 1 and 2 on The Reinforcement Learning Problem and Multi-Arm Bandits.
Oct 20	Temporal Difference Learning Eligibility Traces	(pdf); (4pp) (pdf); (4pp)	Chapter 6 on Temporal-Difference Learning Reinforcement Learning: An Introduction Sutton & Barto, (Draft 2nd ed) (Optional) Chapter 7 on Eligibility Traces.
Oct 27	Papers and Projects (Admin) Software Resources & Demos Planning (Eldan) Reinforcement Learning (David) Arcade Learning Environment (Juliana)	(pdf); (4pp) (presentation); (info sheet) (pdf); (4pp) (presentation); (info sheet)	3 papers for next week (see individual paper assignments and questions)
Nov 3	Paper Presentations: P2 - NRPA (Hengwei) P4 - RTA* (Van) P13 - PRP (Wuga)	(pdf) (pdf) (pdf)	(1) Please skim Chapter 9 of Sutton & Barto 2nd edition in preparation for next week's lecture. (2) 1 paper for next week (see individual paper assignments and questions)
Nov 10	Paper Presentations: P8 - LAO* (Jonathan) Value Function Approximation	(pdf) (pdf)	3 papers for next week (see individual paper assignments and questions)
Nov 17	Paper Presentations: P9 - MC Tree Search & Go (Yang Qiao) P10 - MC and Finite MDP (Ryan) P24 - Policy Gradient & Actor-Critic Methods (Jon)	(pdf) (pdf) (pdf); (supplemental pdf)	3 papers for next week (see individual paper assignments and questions)
Nov 24	Paper Presentations: P18 - Reward Shaping (Toryn) P17 - Intrinsic Motivation (Amna) P20 - Apprenticeship Learning (Kathy)	(pdf) (pdf) (pdf)	3 papers for next week (see individual paper assignments and questions)
Dec 1	Paper Presentations: P21 - Deep RL (Bo Wen) P22 - Atari w/ Shallow RL (Rodrigo) P23 - AlphaGo (Shun)	(pdf) (pdf) (pdf)	For next week -- Presentations
