The University of Southampton

MANG6054 Credit Scoring and Data Mining

Module Overview

The module will start by defining the concept of Knowledge Discovery in Data (KDD) as consisting of three steps: data pre-processing, data mining and post-processing. Next, we will zoom into the data mining step and distinguish two types of data mining: descriptive data mining (e.g. clustering, association and sequence rules) and predictive data mining (e.g. regression and classification). The module will then illustrate how KDD can be successfully used to develop credit scoring applications, where the aim is to distinguish good customers from bad customers (defaulters) given their characteristics. The importance of developing good credit scoring models will be highlighted in the context of the Basel II and III guidelines. The theoretical concepts will be illustrated using real-life credit scoring cases and the SAS Enterprise Miner software.

Aims and Objectives

Learning Outcomes

Knowledge and Understanding

Having successfully completed this module, you will be able to demonstrate knowledge and understanding of:

  • the potential of KDD and data mining for developing scorecards.

Subject Specific Intellectual and Research Skills

Having successfully completed this module you will be able to:

  • work with software to develop credit scoring solutions;
  • develop a scorecard using data mining techniques.

Transferable and Generic Skills

Having successfully completed this module you will be able to:

  • critically analyse practical difficulties that arise when implementing scorecards;
  • understand the cross-fertilisation potential to other business contexts (e.g. fraud detection, CRM).


Introduction:

  • Knowledge Discovery in Data
  • The KDD process model
  • Descriptive versus predictive data mining
  • Credit scoring: problem statement, origins and objectives
  • The Basel II and III regulations
  • Risk management
  • Consumer credit scoring, Behavioural Scoring, Collection Scoring, Bankruptcy Prediction
  • Risk Based Pricing (customization of credit products)
  • Customised scorecards versus generic scorecards
  • Developing scorecards

Data pre-processing:

  • Selecting the sample
  • Segmentation
  • Example variables needed for application and behavioural scoring
  • Oversampling versus Undersampling
  • Credit scoring characteristics
  • Application form characteristics
  • Credit bureau characteristics
  • Reject inference
  • Definitions of good and bad
  • Binary versus three-way classification (good, bad, and indeterminate)
  • Outlier detection
  • Missing values
  • Nominal variables versus Ordinal variables

Data mining:

  • Basic concepts of classification
  • Classification techniques (logistic regression, decision trees, neural networks)
  • Overfitting versus generalisation
  • Input selection (Filters, Wrappers …)
  • Setting the cut-off
  • Measuring scorecard performance (ROC curves, Lift, Gini …)

Post processing:

  • Reporting
  • Strategy curve
  • Profit scoring
  • Recalibrating scorecards
  • Tracking scorecards
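
As an illustration of the "Measuring scorecard performance (ROC curves, Lift, Gini …)" topic listed under Data mining, the following is a minimal sketch in Python. It is not part of the module materials, which use SAS Enterprise Miner. It assumes that a higher score means a better credit risk and that bads (defaulters) are labelled 1 and goods 0. The function names and the sample data are hypothetical. The area under the ROC curve (AUC) is computed as the share of good/bad pairs in which the good customer has the higher score, and the Gini coefficient is then 2 × AUC − 1.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Assumption: higher score = better risk; label 1 = bad, 0 = good.
    Returns the fraction of (good, bad) pairs in which the good customer
    scores higher than the bad one; ties count as half a pair.
    """
    goods = [s for s, y in zip(scores, labels) if y == 0]
    bads = [s for s, y in zip(scores, labels) if y == 1]
    if not goods or not bads:
        raise ValueError("need at least one good and one bad customer")

    wins = 0.0
    for g in goods:
        for b in bads:
            if g > b:
                wins += 1.0
            elif g == b:
                wins += 0.5
    return wins / (len(goods) * len(bads))


def gini(scores, labels):
    """Gini coefficient of a scorecard, defined as 2 * AUC - 1."""
    return 2.0 * auc(scores, labels) - 1.0


if __name__ == "__main__":
    # Hypothetical scores for six applicants; 1 marks a defaulter.
    scores = [720, 680, 650, 600, 580, 540]
    labels = [0, 0, 1, 0, 1, 1]
    print("AUC :", round(auc(scores, labels), 3))
    print("Gini:", round(gini(scores, labels), 3))
```

In this toy sample, eight of the nine good/bad pairs are ranked correctly, so the AUC is 8/9 and the Gini is 7/9. The pairwise loop is quadratic in the sample size, so a rank-based calculation would be preferable on real portfolios.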

Learning and Teaching

Teaching and learning methods

The module is delivered through pre-course reading and lectures. The various concepts will be illustrated using real-life credit scoring data and software. In addition there will be some in-lecture exercises.

Independent Study: 63
Total study time: 75

Resources & Reading list

Thomas, L.C., Edelman, D.B. and Crook, J.N. (2002). Credit Scoring and Its Applications. 

Thomas, L.C. (2009). Consumer Credit Models: Pricing, Profit, and Portfolios. 

Baesens, B. (2014). Analytics in a Big Data World: The Essential Guide to Data Science and its Applications.

Baesens, B., Rösch, D. and Scheule, H. (2016). Credit Risk Analytics: Measurement Techniques, Applications, and Examples in SAS.

Anderson, R. (2007). The Credit Scoring Toolkit. 

Van Gestel, T. and Baesens, B. (2009). Credit Risk Management. Basic concepts: financial risk components, rating analysis, models, economic and regulatory capital. 

Hastie, T., Tibshirani, R. and Friedman, J. (2013). The Elements of Statistical Learning.



Set exercises - non-exam


Method                     Percentage contribution
Coursework (2000 words)    100%



Repeat Information

Repeat type: Internal & External


Costs associated with this module

Students are responsible for meeting the cost of essential textbooks, and of producing such essays, assignments, laboratory reports and dissertations as are required to fulfil the academic requirements for each programme of study.

In addition to this, students registered for this module typically also have to pay for:


Recommended texts for this module may be available in limited supply in the University Library and students may wish to purchase the mandatory/additional reading text as appropriate.

Please also ensure you read the section on additional costs in the University’s Fees, Charges and Expenses Regulations in the University Calendar available at
