This module is compulsory only for the MSc Genomics (Informatics) pathway and is optional for other pathways.
This module will allow students to develop skills in the analysis of data generated by different omic technologies, giving particular experience in the analysis of transcriptomic and cancer genomic data using command-line tools.
Aims and Objectives
Having successfully completed this module you will be able to:
- Design and apply appropriate machine learning approaches to analyse complex and high-dimensional biological datasets (NGS, proteomics and clinical datasets)
- Apply appropriate tools for splice-aware cDNA sequence alignment, quantification of various aspects of transcription (for example gene, exon and transcript abundance) and differential gene expression (DGE) analysis in the Linux environment
- Develop strategies to prioritise candidate genes from DGE for further study (e.g. enrichment analysis, co-expression analysis)
- Create basic scripts and pipelines for the automated analysis of NGS datasets in the Linux environment
- Apply appropriate tools for quality control, sequence alignment, variant calling, annotation and variant filtration to identify potentially pathogenic variants, including somatic driver mutations, in the Linux environment
- Introduction to the Linux command line and important commands. Combining commands, redirecting their output, and writing basic scripts to document and replicate analyses
- Using command line tools for data pre-processing, manipulation of VCF files and customised assessments of sequence coverage
- Principles of sequencing tumour-normal pairs to identify somatic mutations
- Somatic variant discovery; familiarity with the statistical significance of somatic mutations (somatic P-value); annotation using multiple databases, including ClinVar and COSMIC; and estimation of somatic driver mutations using in silico tools
- Principles of RNA sequencing to determine gene expression profiles
- Understanding split-read mapping and aligning RNA-seq data to the reference genome
- How to identify known and novel transcripts, quantify expression and perform differential expression analyses at various levels (genes, exons and transcripts) using appropriate software
- Understand the benefits and limitations of popular machine learning methods in the generation of new knowledge from complex and high-dimensional biological datasets
- Introduction to pathway analysis, using basic tools for network analysis, network visualisation and modelling biological processes
Learning and Teaching
Teaching and learning methods
The module will comprise two blocks of intensive teaching, each followed by approximately two weeks of independent study.
A variety of learning and teaching methods will be adopted to promote a wide range of skills and meet the differing learning styles of the group.
The teaching will include seminars, practical demonstrations, discussions and exercises on the interpretation of data and clinical scenarios, and specialist lectures given by a range of academics. This will ensure a breadth and depth of perspective, giving a good balance between background theories and principles and practical experience.
|Total study time||150|
The assessment for the module provides you with the opportunity to demonstrate achievement of the learning outcomes. In addition to the summative assessments, during the course of the module there will be opportunities to obtain feedback in the form of unassessed, formative activities.
The pass mark for this module is 50%. If you fail the module, the Board of Examiners may offer you the opportunity to submit work at the next referral (re-sit) opportunity.
This is how we’ll give you feedback as you are learning. It is not a formal test or exam.
Workshop activities
This is how we’ll formally assess what you have learned in this module.
|Short answer questions||50%|
|Analysis and report||50%|
This is how we’ll assess you if you don’t meet the criteria to pass this module.
Repeat type: Internal & External