Sample-based Learning Methods Course Syllabus
Full curriculum breakdown — modules, lessons, estimated time, and outcomes.
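Overview: This course covers sample-based reinforcement learning methods, which learn from sampled experience rather than requiring a complete model of the environment. It is organized into five modules. The first four each take an estimated 4 hours and cover Monte Carlo methods, temporal-difference (TD) learning, TD control, and planning with tabular methods. The fifth is a 6-hour final project in which you implement one of these algorithms and evaluate it on a control task. The total estimated time commitment is about 22 hours. Each module lists its lessons below, followed by the prerequisites and expected outcomes.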
Module 1: Monte Carlo Methods
Estimated time: 4 hours
- Introduction to Monte Carlo methods for prediction
- Monte Carlo estimation of value functions
- Monte Carlo with exploring starts
- On-policy Monte Carlo control
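As a rough illustration of the prediction lessons in this module, here is a minimal sketch of first-visit Monte Carlo value estimation. The environment (a five-state random walk that pays +1 for exiting on the right) and the names `generate_episode` and `first_visit_mc` are assumptions chosen for this example, not material taken from the course.

```python
import random
from collections import defaultdict


def generate_episode(rng, n_states=5):
    """Hypothetical random walk on states 0..n_states-1, starting in the middle.

    Stepping off the right end gives reward +1, off the left end gives 0.
    Returns a list of (state, reward) pairs.
    """
    state = n_states // 2
    episode = []
    while True:
        next_state = state + rng.choice((-1, 1))
        reward = 1.0 if next_state == n_states else 0.0
        episode.append((state, reward))
        if next_state < 0 or next_state >= n_states:
            return episode
        state = next_state


def first_visit_mc(num_episodes=5000, gamma=1.0, seed=0):
    """Estimate state values by averaging returns from first visits only."""
    rng = random.Random(seed)
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)
    for _ in range(num_episodes):
        episode = generate_episode(rng)
        # Record the time step at which each state first appears.
        first_visit = {}
        for t, (s, _r) in enumerate(episode):
            first_visit.setdefault(s, t)
        # Walk backwards, accumulating the discounted return g.
        g = 0.0
        for t in reversed(range(len(episode))):
            s, r = episode[t]
            g = gamma * g + r
            if first_visit[s] == t:
                returns_sum[s] += g
                returns_count[s] += 1
    return {s: returns_sum[s] / returns_count[s] for s in returns_sum}


if __name__ == "__main__":
    values = first_visit_mc()
    for s in sorted(values):
        print(s, round(values[s], 3))
```

For this particular walk the true value of state s is (s + 1) / 6, so the estimates can be checked against a known answer.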
Module 2: Temporal-Difference Learning
Estimated time: 4 hours
- Understanding TD learning as a hybrid of Monte Carlo and dynamic programming (DP)
- TD(0) for prediction
- TD error and bootstrapping
- Comparison of Monte Carlo and TD methods
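To make the TD error and bootstrapping concrete, the following is a minimal sketch of tabular TD(0) prediction. The five-state random walk, the constant step size, and the function name `td0_random_walk` are assumptions made for illustration and are not specified by the syllabus.

```python
import random


def td0_random_walk(num_episodes=10000, alpha=0.01, gamma=1.0, n_states=5, seed=0):
    """Tabular TD(0) on a hypothetical random walk with +1 for exiting right."""
    rng = random.Random(seed)
    v = [0.5] * n_states  # arbitrary initial value estimates
    for _ in range(num_episodes):
        state = n_states // 2
        while True:
            next_state = state + rng.choice((-1, 1))
            terminal = next_state < 0 or next_state >= n_states
            reward = 1.0 if next_state == n_states else 0.0
            # Bootstrapping: the target uses the current estimate of the next state.
            v_next = 0.0 if terminal else v[next_state]
            td_error = reward + gamma * v_next - v[state]
            v[state] += alpha * td_error
            if terminal:
                break
            state = next_state
    return v


if __name__ == "__main__":
    print([round(x, 3) for x in td0_random_walk()])
```

Unlike a Monte Carlo method, this update runs after every step and does not wait for the episode to finish. With a constant step size the estimates keep fluctuating slightly around the true values rather than settling exactly.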
Module 3: TD Control Methods
Estimated time: 4 hours
- Sarsa: on-policy TD control
- Expected Sarsa algorithm
- Q-learning: off-policy TD control
- Comparative analysis of TD control strategies
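The three control methods in this module differ mainly in the target they bootstrap from. The sketch below puts the three targets side by side. The function names and the epsilon-greedy target policy used for Expected Sarsa are assumptions for illustration only.

```python
def sarsa_target(reward, gamma, q_next, next_action):
    """On-policy: uses the value of the action actually taken next."""
    return reward + gamma * q_next[next_action]


def q_learning_target(reward, gamma, q_next):
    """Off-policy: uses the value of the greedy next action."""
    return reward + gamma * max(q_next)


def expected_sarsa_target(reward, gamma, q_next, epsilon):
    """Uses the expected next value under an (assumed) epsilon-greedy policy."""
    n = len(q_next)
    greedy = max(range(n), key=lambda a: q_next[a])
    probs = [epsilon / n] * n
    probs[greedy] += 1.0 - epsilon
    expected = sum(p * q for p, q in zip(probs, q_next))
    return reward + gamma * expected


if __name__ == "__main__":
    q_next = [1.0, 3.0]  # hypothetical action values in the next state
    print(sarsa_target(0.0, 0.5, q_next, next_action=0))
    print(q_learning_target(0.0, 0.5, q_next))
    print(expected_sarsa_target(0.0, 0.5, q_next, epsilon=0.2))
```

In each case the update itself has the same form: move Q(s, a) a fraction alpha of the way toward the target. With epsilon set to 0, the Expected Sarsa target coincides with the Q-learning target.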
Module 4: Planning and Learning with Tabular Methods
Estimated time: 4 hours
- Model-based vs. model-free reinforcement learning
- Simulated experience and planning
- Dyna architecture: integrating planning and learning
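One way to picture how Dyna integrates planning and learning is a tabular Dyna-Q loop: each real step updates the action values, records the transition in a model, and then replays a few remembered transitions. The sketch below is a minimal version under stated assumptions: the deterministic six-state chain, the hyperparameters, and all names (`step`, `epsilon_greedy`, `dyna_q`) are hypothetical and not part of the course material.

```python
import random

N_STATES = 6      # states 0..5; state 5 is terminal
ACTIONS = (0, 1)  # 0 = left, 1 = right


def step(state, action):
    """Deterministic chain: reward 1 only on reaching the terminal state."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[state])
    # Break ties randomly so the agent does not get stuck early on.
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def dyna_q(episodes=50, planning_steps=10, alpha=0.5, gamma=0.9,
           epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    model = {}  # (state, action) -> (next_state, reward)

    def update(s, a, r, s2):
        # Q-learning update; the terminal row is never updated and stays 0.
        q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])

    for _episode in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = epsilon_greedy(q, state, epsilon, rng)
            next_state, reward = step(state, action)
            update(state, action, reward, next_state)      # learn from real experience
            model[(state, action)] = (next_state, reward)  # learn the model
            for _plan in range(planning_steps):            # plan with simulated experience
                s, a = rng.choice(list(model))
                s2, r = model[(s, a)]
                update(s, a, r, s2)
            state = next_state
    return q


if __name__ == "__main__":
    for s, row in enumerate(dyna_q()):
        print(s, [round(x, 3) for x in row])
```

Setting `planning_steps=0` reduces this loop to plain one-step Q-learning, which gives a simple baseline for seeing what the planning updates contribute.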
Module 5: Final Project
Estimated time: 6 hours
- Implement a sample-based reinforcement learning algorithm
- Apply the algorithm to a control task environment
- Analyze performance and convergence behavior
Prerequisites
- Familiarity with probability theory and linear algebra
- Intermediate Python programming skills
- Basic understanding of reinforcement learning concepts
What You'll Be Able to Do After
- Understand and apply Monte Carlo methods for value function estimation
- Implement and compare TD learning algorithms like Sarsa and Q-learning
- Differentiate between on-policy and off-policy control methods
- Improve sample efficiency by combining planning and learning in the Dyna architecture
- Apply sample-based methods to real-world decision-making problems