Module overview
This module is structured in two parts: the foundations of data analytics and its applications in credit risk assessment.

In the first part, students will be introduced to the core concepts and workflow of data analytics, with a focus on data pre-processing and data mining techniques. Key analytical methods covered include linear regression, classification techniques (such as logistic regression and decision trees), and clustering approaches (including K-means and hierarchical clustering). Essential modelling techniques, such as model selection, regularisation, and cross-validation, will also be explored to ensure robust and interpretable analysis. All practical work will be conducted using the Python programming language.

The second part focuses on the application of these techniques to credit risk assessment, particularly in the context of retail credit scoring. It covers data preparation techniques such as cleaning, visualisation, standardisation, binning, and Weight of Evidence (WOE) transformation. Students will learn how to develop and evaluate credit scorecards, and how to measure and compare model performance using appropriate metrics. Ethical and sustainability considerations, such as fairness in model outcomes, will also be addressed. Real-world datasets and case studies will provide practical context throughout.
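
As an indication of the kind of practical work involved, the following is a minimal sketch of a Weight of Evidence (WOE) transformation for a single binned variable. It is illustrative only and not taken from the module materials: the function name woe_table, the bin labels, and the toy data are all hypothetical. It assumes the common convention WOE = ln(share of goods in the bin / share of bads in the bin); some texts use the inverse ratio, which flips the sign.

```python
import math
from collections import defaultdict


def woe_table(bins, defaults):
    """Return {bin: (woe, iv_contribution)} for one binned variable.

    bins     -- bin label for each applicant (hypothetical toy labels)
    defaults -- 1 if the applicant defaulted ("bad"), 0 otherwise ("good")

    Assumed convention: WOE = ln(dist_good / dist_bad).
    """
    good = defaultdict(int)
    bad = defaultdict(int)
    for label, flag in zip(bins, defaults):
        if flag == 1:
            bad[label] += 1
        else:
            good[label] += 1

    total_good = sum(good.values())
    total_bad = sum(bad.values())

    table = {}
    for label in sorted(set(bins)):
        # This sketch assumes every bin holds at least one good and one bad;
        # an empty cell would raise an error here. In practice such bins are
        # usually merged or smoothed before computing WOE.
        dist_good = good[label] / total_good
        dist_bad = bad[label] / total_bad
        woe = math.log(dist_good / dist_bad)
        table[label] = (woe, (dist_good - dist_bad) * woe)
    return table


# Toy data (invented for illustration): three income bands, four applicants each.
bins = ["low"] * 4 + ["mid"] * 4 + ["high"] * 4
defaults = [1, 1, 1, 0,   # low:  3 bad, 1 good
            1, 1, 0, 0,   # mid:  2 bad, 2 good
            1, 0, 0, 0]   # high: 1 bad, 3 good

table = woe_table(bins, defaults)
# Information Value is the sum of the per-bin contributions.
iv = sum(contribution for _, contribution in table.values())

for label, (woe, contribution) in table.items():
    print(f"{label:>4}: WOE = {woe:+.3f}, IV contribution = {contribution:.3f}")
print(f"Information Value = {iv:.3f}")
```

Under this convention, a bin with a positive WOE contains proportionally more good accounts than the portfolio as a whole, and a bin with a negative WOE contains proportionally more bad accounts.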