Module overview
Linked modules
Pre-requisite: COMP3206 or COMP3223 or COMP6229 or COMP6245
Aims and Objectives
Learning Outcomes
Subject Specific Intellectual and Research Skills
Having successfully completed this module you will be able to:
- Critically appraise recent scientific literature in reinforcement and online learning
- Critically appraise the merits and shortcomings of model architectures on specific problems
Subject Specific Practical Skills
Having successfully completed this module you will be able to:
- Work confidently with reinforcement and online learning algorithms, implementing them and evaluating their performance and applicability in different application domains
- Apply existing reinforcement and online learning models to real applications
Knowledge and Understanding
Having successfully completed this module, you will be able to demonstrate knowledge and understanding of:
- The key factors that have made reinforcement and online learning successful for various applications
- The underlying mathematical and algorithmic principles of reinforcement and online learning
Syllabus
Classical Reinforcement Learning
- Temporal-difference (TD) learning
- Q-learning
- State Space Models
- Example: TD-Gammon
Online Learning
- Regret minimisation
- Stochastic vs. adversarial
- Full information, semi-bandit, and bandit feedback
Monte Carlo Tree Search (MCTS)
Applications
- AlphaZero: combining MCTS, reinforcement learning and deep learning
- Hyper-parameter search in deep learning with bandit theory
- Playing no-limit poker with counterfactual regret minimisation
Learning and Teaching
Teaching and learning methods
Lectures and labs
Type | Hours |
---|---|
Completion of assessment task | 60 |
Wider reading or practice | 46 |
Lecture | 24 |
Specialist Laboratory | 20 |
Total study time | 150 |
Resources & Reading list
Textbooks
Richard Sutton and Andrew Barto (2017). Reinforcement Learning: An Introduction.
Csaba Szepesvári (2010). Algorithms for Reinforcement Learning.
Assessment
Summative
This is how we’ll formally assess what you have learned in this module.
Method | Percentage contribution |
---|---|
Continuous Assessment | 100% |
Referral
This is how we’ll assess you if you don’t meet the criteria to pass this module.
Method | Percentage contribution |
---|---|
Set Task | 100% |
Repeat
An internal repeat is where you take all of your modules again, including any you passed. An external repeat is where you only re-take the modules you failed.
Method | Percentage contribution |
---|---|
Set Task | 100% |
Repeat Information
Repeat type: Internal & External