University of Toronto - Fall 2016
Department of Computer Science

CSC 2542: Topics in KR&R: Algorithms for Sequential Decision Making

Time Table

Day Topics Slides Readings
Sept 15 (Admin) Introduction (pdf); (4pp)
No reading.
Sept 22 Overview Planning
Planning and Search (Part 1)
(pdf); (4pp)
(pdf); (4pp)
Read "Everything You Always Wanted to Know about Planning (But Were Afraid to Ask)", Joerg Hoffmann
Sept 29 Planning and Search (Part 2)
Intro to Markov Decision Processes
(Admin) Assignment Overview
(pdf); (4pp)
(pdf)
(pdf); (4pp)
(1) Chapter 3 on Finite Markov Decision Processes, Reinforcement Learning: An Introduction, Sutton & Barto (Draft 2nd ed.)
(2) "Using Alternative Suboptimality Bounds in Heuristic Search", Valenzano et al.

Oct 6 Planning as Dynamic Programming
Bandits
Rick's Bandits Program (Try me!)
(pdf)
(pdf); (4pp)
(py)
Chapter 4 on Dynamic Programming, Reinforcement Learning: An Introduction, Sutton & Barto (Draft 2nd ed.)

Oct 13 Bandits & Monte Carlo Methods
(Admin) Class Project
(pdf); (4pp)
(pdf); (4pp)
Chapter 5 on Monte Carlo Methods, Reinforcement Learning: An Introduction, Sutton & Barto (Draft 2nd ed.)

(Optional) Chapters 1 and 2 on The Reinforcement Learning Problem and Multi-Arm Bandits.

Oct 20 Temporal Difference Learning
Eligibility Traces
(pdf); (4pp)
(pdf); (4pp)
Chapter 6 on Temporal-Difference Learning, Reinforcement Learning: An Introduction, Sutton & Barto (Draft 2nd ed.)

(Optional) Chapter 7 on Eligibility Traces.

Oct 27 Papers and Projects (Admin)
Software Resources & Demos
    Planning (Eldan)
    Reinforcement Learning (David)
    Arcade Learning Environment (Juliana)
(pdf); (4pp)

(presentation); (info sheet)
(pdf); (4pp)
(presentation); (info sheet)
3 papers for next week (see individual paper assignments and questions)
Nov 3 Paper Presentations:
    P2 - NRPA (Hengwei)
    P4 - RTA* (Van)
    P13 - PRP (Wuga)

(pdf)
(pdf)
(pdf)
(1) Please skim Chapter 9 of Sutton & Barto 2nd edition in preparation for next week's lecture.

(2) 1 paper for next week (see individual paper assignments and questions)

Nov 10 Paper Presentations:
    P8 - LAO* (Jonathan)
Value Function Approximation

(pdf)
(pdf)

3 papers for next week (see individual paper assignments and questions)
Nov 17 Paper Presentations:
    P9 - MC Tree Search & Go (Yang Qiao)
    P10 - MC and Finite MDP (Ryan)
    P24 - Policy Gradient & Actor-Critic Methods (Jon)


(pdf)
(pdf)
(pdf); (supplemental pdf)

3 papers for next week (see individual paper assignments and questions)
Nov 24 Paper Presentations:
    P18 - Reward Shaping (Toryn)
    P17 - Intrinsic Motivation (Amna)
    P20 - Apprenticeship Learning (Kathy)


(pdf)
(pdf)
(pdf)

3 papers for next week (see individual paper assignments and questions)
Dec 1 Paper Presentations:
    P21 - Deep RL (Bo Wen)
    P22 - Atari w/ Shallow RL (Rodrigo)
    P23 - AlphaGo (Shun)


(pdf)
(pdf)
(pdf)
For next week -- Presentations

