The University of Southampton

BIOL6052 Data Management and Generalised Linear Modelling for Biologists

Module Overview

Evidence-based ecology, evolution and conservation require quantitative analyses of field data typically collected under imperfectly controlled conditions and across heterogeneous habitats. This module will develop generic skills in (1) the design of data collection protocols, particularly for field experiments and observational studies, and (2) the testing of hypotheses with appropriate statistical models. Although statistical awareness to undergraduate level will be assumed, the first third of the module will review the core principles of experimental design and analysis that underpin all quantitative methods, using the freely distributed environment R. Thereafter, the module will develop and apply statistical models for data types commonly encountered in fieldwork, using the generalised linear model framework. Scripting for transparent data management will be central to the module. Life science questions commonly seek to explain response variables in terms of predictor variables that co-vary with each other or with nuisance variables, or that cannot be measured in balanced designs. Techniques to resolve these issues will be introduced through a practical approach designed to pre-empt problems that commonly arise in later independent research projects. The final part of the module will treat these issues in multifactorial and multivariate analysis.

Aims and Objectives

Learning Outcomes

Having successfully completed this module you will be able to:

  • Design a logical and suitable data collection protocol for statistical analysis of a test hypothesis;
  • Identify statistical procedures appropriate to different types of hypotheses and data;
  • Interpret the results of statistical tests on given data sets, and use that interpretation to justify the conclusions you draw;
  • Compute statistical analyses in integrated workflows (importing and cleaning data using transparent, repeatable scripts, exporting report and publication-quality figures) using the R environment;
  • Independently use the freely available R environment.


The course will start by introducing study design. Various types of statistical analysis will be covered, including: Regression, ANOVA (using the General Linear Model), ANCOVA, the use of Linear Mixed Models, Generalized Linear Models and Multivariate Techniques. Students will be taught how to analyse data using the R Environment for Statistical and Graphical analysis in student-led workshops that facilitate peer-to-peer learning. These lessons will be reinforced in “no agenda” interactive feedback sessions and through one-to-one surgeries to assist with the particulars of data sets for programme research projects.
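As a rough orientation, most of the model types listed above can be read as special cases of one framework. The notation below is a standard textbook sketch, not taken from the module materials; the symbols $y_i$, $x_{ij}$, $\beta_j$, $\mu_i$ and $g$ are assumptions introduced here for illustration only.

```latex
% Generalised linear model for observation i with p predictors (illustrative notation)
\begin{align}
  y_i &\sim \text{exponential-family distribution with mean } \mu_i \\
  g(\mu_i) &= \beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip}
\end{align}
% Common special cases:
%   Normal errors,   identity link  g(\mu) = \mu                  : general linear model (regression, ANOVA, ANCOVA)
%   Poisson errors,  log link       g(\mu) = \log \mu             : count data
%   Binomial errors, logit link     g(\mu) = \log(\mu / (1 - \mu)): proportions or presence/absence data
```

Under this reading, regression, ANOVA and ANCOVA differ only in whether the predictors are continuous, categorical or a mixture of the two. Linear mixed models extend the right-hand side with random effects for grouped observations.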

Learning and Teaching

Teaching and learning methods

  • Computer workshops will introduce and provide training in the use of statistical packages for hypothesis testing and presentation of results.
  • Large-group tutorials will be led by an educator, based on anonymized student requests for material to revise.
  • Panopto lecture capture.
  • Formal lectures will provide the framework of core concepts and issues.
  • Discussion workshops will be interactive sessions.

Practical classes and workshops 20
Independent Study 122
Total study time 162

Resources & Reading list

Fox, J. & Weisberg, S. (2011). An R Companion to Applied Regression. 

Doncaster, C.P. & Davey, A.J.H. (2007). Analysis of Variance and Covariance: How to Choose and Construct Models of the Life Sciences. 

Hector. The New Statistics: An Introduction for Biologists. 

Beckerman, A.P. & Petchey, O.L. (2012). Getting Started with R.





Method Percentage contribution
Assessment 60%
Exercise 40%


Method Percentage contribution
Assignment Marks carried forward 60%
Exercise 40%

Repeat Information

Repeat type: Internal & External
