Module overview
The module will start by defining the concept of Knowledge Discovery in Data (KDD) as consisting of three steps: data pre-processing, data mining and post-processing. Next, we will zoom into the data mining step and distinguish two types of data mining: descriptive data mining (e.g. clustering, association and sequence rules) and predictive data mining (e.g. regression and classification). The module will then illustrate how KDD can be successfully used to develop credit scoring applications, where the aim is to distinguish good customers from bad customers (defaulters) given their characteristics. The importance of developing good credit scoring models will be highlighted in the context of the Basel II and III guidelines. The theoretical concepts will be illustrated using real-life credit scoring cases and the SAS Enterprise Miner software.
Aims and Objectives
Learning Outcomes
Subject Specific Intellectual and Research Skills
Having successfully completed this module you will be able to:
- work with software to develop credit scoring solutions; develop a scorecard using data mining techniques.
Transferable and Generic Skills
Having successfully completed this module you will be able to:
- critically analyse practical difficulties that arise when implementing scorecards; understand the cross-fertilisation potential to other business contexts (e.g. fraud detection, CRM).
Knowledge and Understanding
Having successfully completed this module, you will be able to demonstrate knowledge and understanding of:
- the potential of KDD and data mining for developing scorecards.
Syllabus
Introduction:
- Knowledge Discovery in Data
- The KDD process model
- Descriptive versus predictive data mining
- Credit scoring: problem statement, origins and objectives
- The Basel II and III regulations
- Risk management
- Consumer credit scoring, Behavioural Scoring, Collection Scoring, Bankruptcy Prediction
- Risk Based Pricing (customization of credit products)
- Customised scorecards versus generic scorecards
- Developing scorecards
Data pre-processing:
- Selecting the sample
- Segmentation
- Example variables needed for application and behavioural scoring
- Oversampling versus Undersampling
- Credit scoring characteristics
- Application form characteristics
- Credit bureau characteristics
- Reject inference
- Definitions of good and bad
- Binary versus three-way classification (good, bad, and indeterminate)
- Outlier detection
- Missing values
- Nominal variables versus Ordinal variables
Data mining:
- Basic concepts of classification
- Classification techniques (logistic regression, decision trees, neural networks)
- Overfitting versus generalisation
- Input selection (Filters, Wrappers …)
- Setting the cut-off
- Measuring scorecard performance (ROC curves, Lift, Gini …)
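As an illustration of the performance measures listed above, the following is a minimal Python sketch (not the SAS Enterprise Miner workflow used in the module) of how the area under the ROC curve (AUC) and the Gini coefficient relate, using Gini = 2 × AUC − 1. The function names `auc` and `gini`, the label coding (1 = good, 0 = bad), and the six example scores are all assumptions made up for this illustration; they are not real credit data.

```python
def auc(scores, labels):
    """Area under the ROC curve, computed by pairwise comparison.

    Assumed coding: label 1 = good customer, 0 = bad customer (defaulter),
    and a higher score is meant to indicate a lower risk.
    """
    goods = [s for s, y in zip(scores, labels) if y == 1]
    bads = [s for s, y in zip(scores, labels) if y == 0]
    if not goods or not bads:
        raise ValueError("need at least one good and one bad customer")

    # Count the (good, bad) pairs in which the good customer scores higher;
    # tied pairs count as half.
    wins = 0.0
    for g in goods:
        for b in bads:
            if g > b:
                wins += 1.0
            elif g == b:
                wins += 0.5
    return wins / (len(goods) * len(bads))


def gini(scores, labels):
    """Gini coefficient derived from the AUC: Gini = 2 * AUC - 1."""
    return 2.0 * auc(scores, labels) - 1.0


# Hypothetical scores and outcomes, for illustration only.
scores = [720, 680, 650, 640, 600, 580]
labels = [1, 1, 0, 1, 0, 0]

print(f"AUC  = {auc(scores, labels):.3f}")   # AUC  = 0.889
print(f"Gini = {gini(scores, labels):.3f}")  # Gini = 0.778
```

The pairwise form is quadratic in the sample size, so it is only suitable for small examples; it is used here because it makes the interpretation of AUC explicit: the probability that a randomly chosen good customer receives a higher score than a randomly chosen bad customer.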
Post-processing:
- Reporting
- Strategy curve
- Profit scoring
- Recalibrating scorecards
- Tracking scorecards
Learning and Teaching
Teaching and learning methods
The module is delivered through pre-course reading and lectures. The various concepts will be illustrated using real-life credit scoring data and software. In addition, there will be some in-lecture exercises.
Type | Hours |
---|---|
Independent Study | 63 |
Teaching | 12 |
Total study time | 75 |
Resources & Reading list
Textbooks
Hastie, T., Tibshirani, R. and Friedman, J. (2013). The Elements of Statistical Learning. NJ, USA: Springer.
Baesens, B., Rösch, D. and Scheule, H. (2016). Credit Risk Analytics: Measurement Techniques, Applications, and Examples in SAS. NJ, USA: Wiley.
Thomas, L.C., Edelman, D.B. and Crook, J.N. (2002). Credit Scoring and Its Applications. Philadelphia, PA: SIAM.
Thomas, L.C. (2009). Consumer Credit Models: Pricing, Profit, and Portfolios. New York: Oxford University Press.
Baesens, B. (2014). Analytics in a Big Data World: The Essential Guide to Data Science and its Applications. Hoboken, NJ: Wiley.
Anderson, R. (2007). The Credit Scoring Toolkit. New York: Oxford University Press.
Van Gestel, T. and Baesens, B. (2009). Credit Risk Management. Basic concepts: financial risk components, rating analysis, models, economic and regulatory capital. New York: Oxford University Press.
Assessment
Formative
Formative assessment description
Set exercises - non-exam
Summative
Summative assessment description
Method | Percentage contribution |
---|---|
Coursework | 100% |
Referral
Referral assessment description
Method | Percentage contribution |
---|---|
Coursework | 100% |
Repeat
Repeat assessment description
Method | Percentage contribution |
---|---|
Coursework | 100% |
Repeat Information
Repeat type: Internal & External