To introduce, via a hands-on approach, the basic concepts and principles of statistical modelling in a computational paradigm.
After taking this module, students should:
- understand why statistical modelling is important,
- understand the terminology and statistical principles associated with modelling,
- know sufficient theory to deal with simple examples, and have gained practical hands-on experience with more complex examples,
- know how to use Python and R to fit, explore and exploit a variety of statistical models.
Introduction and revision
- Python and R, and their interface
- Data input, plotting and summaries
- Standard statistical distributions
- Principles of statistical inference
Regression: linear and generalised linear modelling
- Model construction and estimation
- Model selection and information criteria
- Shrinkage regression (Lasso and ridge methods)
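As a rough illustration of the shrinkage idea listed above, the sketch below computes a ridge estimate in the simplest possible setting: one predictor and no intercept, where the estimate is sum(x*y) / (sum(x*x) + penalty). The function name `ridge_slope`, the toy data and the penalty values are assumptions made for this example and are not taken from the module materials.

```python
def ridge_slope(x, y, penalty):
    """Ridge estimate of beta in y = beta * x (one predictor, no intercept)."""
    if penalty < 0:
        raise ValueError("penalty must be non-negative")
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    # penalty = 0 gives ordinary least squares; larger values shrink towards 0
    return sxy / (sxx + penalty)


# Hypothetical toy data, roughly following y = 2x
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]

# Estimated slope for an increasing sequence of penalties
estimates = {lam: ridge_slope(x, y, lam) for lam in (0.0, 1.0, 10.0, 100.0)}
```

The estimates move steadily towards zero as the penalty grows, which is the behaviour that the ridge and Lasso methods trade against goodness of fit.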
Random effects, mixed models, and data with complex correlation structures
- Grouping structures in data
- Interpretation of random effects and mixed models
- Discrete data and generalised linear mixed models
- Estimation of mixed models
- Autoregression models
Smoothing and nonparametric regression
- Kernel density estimation
- Splines and penalised splines
- Generalised additive models
- Linear smoothing
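A minimal sketch of kernel density estimation, the first smoothing topic above, using a Gaussian kernel and only the Python standard library. The function name `gaussian_kde`, the sample values and the bandwidth are assumptions chosen for illustration; in practice the bandwidth would be selected from the data.

```python
import math


def gaussian_kde(data, bandwidth):
    """Return a function evaluating a Gaussian kernel density estimate."""
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    if not data:
        raise ValueError("data must be non-empty")
    # Normalising constant so that the estimate integrates to one
    norm = 1.0 / (len(data) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum of Gaussian bumps, one centred on each observation
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data
        )

    return density


# Hypothetical sample and bandwidth
sample = [1.2, 1.9, 2.4, 3.1, 4.8]
density = gaussian_kde(sample, bandwidth=0.5)

# Crude Riemann-sum check that the estimate integrates to roughly one
step = 0.01
area = sum(density(-5.0 + i * step) for i in range(1700)) * step
```

A smaller bandwidth gives a rougher estimate that follows individual observations; a larger one gives a smoother estimate.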
Data collection for computational studies
- Fundamentals of design of experiments
- Computer and simulation experiments
- Latin hypercube sampling
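Latin hypercube sampling, the last topic above, can be sketched in a few lines: each dimension of the unit cube is divided into n equal-width strata, one value is drawn in each stratum, and the values are shuffled independently per dimension. The function name `latin_hypercube`, its arguments and the seed are assumptions for this example only.

```python
import random


def latin_hypercube(n_points, n_dims, seed=None):
    """Draw a Latin hypercube sample of n_points in the unit cube [0, 1)^n_dims."""
    if n_points < 1 or n_dims < 1:
        raise ValueError("n_points and n_dims must be at least 1")
    rng = random.Random(seed)
    columns = []
    for _ in range(n_dims):
        # One uniform draw inside each of the n_points equal-width strata
        column = [(k + rng.random()) / n_points for k in range(n_points)]
        # Shuffle so that strata are paired at random across dimensions
        rng.shuffle(column)
        columns.append(column)
    # Assemble the i-th point from the i-th entry of every column
    return [tuple(column[i] for column in columns) for i in range(n_points)]


# A five-run design in two dimensions
design = latin_hypercube(5, 2, seed=1)
```

Projecting the design onto any single dimension gives exactly one point per stratum, which is the property that distinguishes it from simple random sampling.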
Study time allocation
Private study hours: 90
Total study time:
Teaching and learning methods
- 24 lecture hours
- 36 computer workshop hours
- Individual study facilitated via weekly worksheets to support lecture material and assessed coursework
- Supervised problem solving via computer lab sessions
Resources and reading list
Software: Python and R (freely available)
Textbooks: no required textbooks but the following texts are considered useful:
- Davison, A.C. (2008). Statistical Models. CUP (QA 276 DAV).
- Faraway, J. (2014). Linear Models with R (2nd Edn). Chapman and Hall/CRC (QA 279 FAR).
- Gelman, A. and Hill, J. (2006). Data Analysis Using Regression and Multilevel/Hierarchical Models. CUP (HA 31.3 GEL).
- Wood, S.N. (2006). Generalized Additive Models: An Introduction with R. Chapman and Hall/CRC (QA 274.73 WOO).
- Wu, C.F.J. and Hamada, M. (2011). Experiments: Planning, Analysis and Optimisation (2nd Edn). Wiley (QA 279 WU).
| | % contribution to final mark | Final assessment (x) |
| Coursework (formative and summative) | 100% (50% each) | |
- Verbal feedback on (unassessed) worksheets and exercises in lab sessions
- Written feedback on both assessed pieces of coursework
| | % contribution to final mark |
| Repeat a suitably modified piece of coursework | |
Method of repeat year: Repeat year internally or externally