Multimodal Artificial Intelligence

Module overview

Multimodal AI is the study and design of artificial intelligence systems that integrate and learn jointly from multiple heterogeneous data sources, ranging from text, images, and audio to biomedical signals and sensor streams. This module introduces the foundational principles of multimodal representation, alignment, and data fusion, and then examines how these techniques are implemented across a range of application domains, such as health and audiovisual applications. Students will learn how heterogeneous modalities are combined within a single system and how such systems are evaluated for robustness, bias, and reliability. The module also considers the social and regulatory implications of deploying multimodal AI technologies, including responsible and trustworthy AI.

Aims and Objectives

Learning Outcomes

Knowledge and Understanding

Having successfully completed this module, you will be able to demonstrate knowledge and understanding of:

Fundamental concepts of multimodal machine learning.
Data fusion techniques, and how their strengths and weaknesses depend on the downstream application.
Limitations, risks, and practical applications of multimodal machine learning.

Subject Specific Practical Skills

Having successfully completed this module, you will be able to:

Select and implement an appropriate data fusion strategy for a multimodal learning task.
Interpret and reason about complex multimodal data at varying granularities.

Subject Specific Intellectual and Research Skills

Having successfully completed this module, you will be able to:

Critically evaluate responsible and trustworthy AI principles as applied to multimodal AI systems across different use cases.
Critically evaluate emerging multimodal machine learning techniques using evidence from current research literature.

Learning and Teaching

Teaching and learning methods

Lectures, labs and guided self-study

Study time
Type	Hours
Specialist Laboratory	9
Wider reading or practice	32
Lecture	36
Completion of assessment task	45
Preparation for scheduled sessions	10
Revision	18
Total study time	150

Resources & Reading list

Textbooks

Ian Goodfellow, Yoshua Bengio and Aaron Courville (2016). Deep Learning. MIT Press.

Assessment

Summative

This is how we’ll formally assess what you have learned in this module.

Breakdown
Method	Percentage contribution
Examination	60%
Coursework	40%

Referral

This is how we’ll assess you if you don’t meet the criteria to pass this module.

Breakdown
Method	Percentage contribution
Examination	100%

Repeat

An internal repeat is where you take all of your modules again, including any you passed. An external repeat is where you only re-take the modules you failed.

Breakdown
Method	Percentage contribution
Examination	100%