Module overview
It is important that we provide bioinformatic cell analysis training to students in order to significantly improve research possibilities in their future careers in Biomedical Sciences. The quantitative module in cell biology will focus on the practical use of the methods employed, rather than the mathematics underpinning them. Some of the mathematics will be discussed, but no prior knowledge will be assumed. The analyses will predominantly be conducted using "R".
Students with or without experience of programming/mathematics will be enrolled on this course. Students with no background in this area will not be disadvantaged as they will be provided with computing support to succeed.
There is no opportunity to repeat the year on this programme.
Aims and Objectives
Learning Outcomes
Cognitive Skills
Having successfully completed this module you will be able to:
- Understand and summarize different methods of data analysis and critically appraise their appropriate use.
- Apply information identified from published sources to your own investigations.
Knowledge and Understanding
Having successfully completed this module, you will be able to demonstrate knowledge and understanding of how to:
- Extract and process data from a range of high-throughput experimental sources
- Use basic supervised and unsupervised methods to analyse multivariate biomedical data sets
- Synthesize the results of different methods of analysis and draw appropriate biological conclusions.
Learning Outcomes
Having successfully completed this module you will be able to:
- Produce concise written summaries of your analysis, including interpretation of statistical results in terms of underlying biology.
Transferable and Generic Skills
Having successfully completed this module you will be able to:
- Exercise initiative and personal responsibility
- Organise your own activities to achieve a desired outcome within a limited amount of time.
- Direct your own learning.
Syllabus
- Session 0: Introduction to the course, software installation and intro to R (2h)
- Session 1: Introduction to multivariate data analysis to investigate complex disease: data structure, metadata, data pre-processing and intro to exploratory data analysis (EDA)
- Session 2: Statistical models: EDA, t-test, linear regression, intro to clustering
- Session 3: Data visualisation and biological meaning
- Session 4: RNA-Seq: multivariate analysis of a transcriptomic dataset, from alignment to biological meaning
- Session 5: Single-cell RNA-Seq: from clustering to biological meaning
Learning and Teaching
Teaching and learning methods
Teaching will consist of five one-day master classes plus one additional introductory session to set up equipment and the R environment. Each master class will cover one of the five main syllabus sessions (Sessions 1 to 5). Each will begin in the morning with a taught overview of the material to be covered, followed in the afternoon by a hands-on computer session in which students can explore the data types and methods discussed. Example datasets for exploration will be provided at each session. Collaborative working between students will be encouraged during these sessions.
The afternoon sessions will be run by a member of academic staff and one or two computational PhD or postdoctoral demonstrators (to be paid at standard demonstration rates), of whom many suitable candidates are available in Southampton.
The training and analysis will primarily be conducted using "R". All methods will be demonstrated in "R", and full code for example problems will be provided; prior knowledge of programming is beneficial but not required.
Total Study Time
The module will reflect the normal distribution of 200 hours of student effort attributable to each 20 credit module.
Contact hours: 27
Non-contact hours: 173
Lectures may be delivered face to face or online. Tutorials, support and feedback for practical computer workshop sessions may be given through face-to-face or live online sessions.
Type | Hours |
---|---|
Independent Study | 173 |
Teaching | 27 |
Total study time | 200 |
Resources & Reading list
General Resources
Access to a computer, laptop or workstation with a working R environment and internet access is essential, as the course is based on computational data analysis.
Textbooks
Hastie, Tibshirani, Friedman. (2009). The Elements of Statistical Learning. Springer.
Bishop. (2006). Pattern Recognition and Machine Learning. Springer.
Assessment
Assessment strategy
R Test (10%): A short paper-based class test, comprising approximately 10 questions, will be held after weekly session three to monitor and facilitate the acquisition of basic skills required for the primary substantive summative assessments.
Coursework 1 (40%): Description of Methods and Interpretation.
1. A written summary of data visualisation techniques and their applications in cell and molecular biology, including sections on dimensionality reduction and clustering as a minimum, using the references provided in lectures and on Blackboard to start. Students are encouraged to explore a range of different methods, based upon their own reading. 1500 word limit. (0.6 of the mark)
2. Appraising use of data visualisation techniques to derive biological meaning conveyed by them in a set of figures provided. 500 word limit. (0.4 of the mark)
Coursework 2 (50%): Analysis of a dataset using an R script. Full problem details and the dataset will be provided to students at the start of the course via Blackboard. Submission will include annotated code. Students will be assessed on how thoroughly their method analyses the dataset provided. The submission should include:
1. A fully annotated R script that executes all analysis commands and runs without error (0.3 of this mark section).
2. Preprocessing of the dataset (filtering, normalisation) and exploratory data analysis (0.2 of this mark section).
3. A set of professionally produced figures with appropriate captions that summarise the analysis (0.3 of this mark section).
4. A biological interpretation of analysis results (0.2 of this mark section).
Assessment requirements
You must pass the module with an average overall mark of 50% or above. There is compensation between assessment elements provided a mark of 40% or higher is attained in each element. Candidates who fail one or more elements of the module at the first attempt will be permitted to re-sit the failed elements as supplementary assessments. Candidates who achieve at least 50% overall at the second attempt will be permitted to pass the module with a capped mark of 50%.
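As a worked illustration of the pass rule above, the following minimal Python sketch computes an overall mark from the three element weights (class test 10%, written summary 40%, data analysis 50%). It assumes one reading of the rule: a pass needs an overall weighted mark of at least 50% and at least 40% in every element. The function names, dictionary keys and example marks are hypothetical and are not part of the module regulations.

```python
# Element weights taken from the assessment strategy:
# class test 10%, written summary 40%, data analysis 50%.
WEIGHTS = {"class_test": 0.10, "written_summary": 0.40, "data_analysis": 0.50}

PASS_MARK = 50.0      # overall mark needed to pass the module
ELEMENT_FLOOR = 40.0  # minimum mark per element for compensation to apply


def overall_mark(marks):
    """Weighted average of the element marks (each on a 0-100 scale)."""
    return sum(WEIGHTS[name] * marks[name] for name in WEIGHTS)


def passes_module(marks):
    """Assumed reading of the rule: overall >= 50 and every element >= 40."""
    meets_floor = all(marks[name] >= ELEMENT_FLOOR for name in WEIGHTS)
    return meets_floor and overall_mark(marks) >= PASS_MARK


# Hypothetical marks, for illustration only.
example = {"class_test": 45, "written_summary": 62, "data_analysis": 48}
print(round(overall_mark(example), 1))
print(passes_module(example))
```

In this example two elements fall below 50% but none falls below 40%, so under the assumed reading the stronger written summary compensates for them.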
Summative
This is how we’ll formally assess what you have learned in this module.
Method | Percentage contribution |
---|---|
Written summary | 40% |
Data Analysis | 50% |
Class Test | 10% |
Referral
This is how we’ll assess you if you don’t meet the criteria to pass this module.
Method | Percentage contribution |
---|---|
Class Test | 10% |
Written summary | 40% |
Data Analysis | 50% |