The University of Southampton

COMP6235 Foundations of Data Science

Module Overview

Welcome to Foundations of Data Science! 'Data Scientist' has been described as the sexiest job of the 21st century, and demand is rising quickly for highly skilled practitioners who can extract meaningful insights from the growing amount of data available for study. This course introduces a range of topics and concepts related to the data science process. It covers the technical pipeline from data collection through processing and analysis to visualisation. You will gain knowledge of topics such as statistics, crawling data, data visualisation, advanced databases and cloud computing, along with a toolkit for working with data (including R, D3, Google Refine and Hadoop). The course includes a mix of lectures, tutorials, hands-on exercises and invited talks from expert data science practitioners. Coursework will give you experience using the theory and techniques delivered in the lectures, while the group project will give you the chance to apply your knowledge of the data science process and toolkit to the development of a data science application.

Aims and Objectives

Module Aims

To introduce a range of topics and concepts related to the data science process

Learning Outcomes

Knowledge and Understanding

Having successfully completed this module, you will be able to demonstrate knowledge and understanding of:

  • Key concepts in data science, including tools, approaches, and application scenarios
  • Topics in data collection, sampling, quality assessment and repair
  • Topics in statistical analysis and machine learning
  • Topics in data processing at scale
  • State-of-the-art tools to build data-science applications for different types of data, including text and CSV data

Subject Specific Practical Skills

Having successfully completed this module you will be able to:

  • Solve real-world data-science problems and build applications in this space

Subject Specific Intellectual and Research Skills

Having successfully completed this module you will be able to:

  • Understand and apply the fundamental concepts and techniques in data science

Syllabus

The course will introduce students to the data scientist toolkit and the underlying core concepts. It will cover the full technical pipeline from data collection (sampling methods, crawling) to processing and basic notions of statistical analysis and visualization. The module will also include advanced topics in high-performance computing, including non-relational databases and MapReduce. By taking this course the students will be provided with the basic toolkit to work with data (CSV, R, MongoDB). To support these learning objectives, the coursework will include exercises and a group project in which students will use existing open data sets and build their own application.

The course will cover the following concepts:

  • Fundamentals and core terminology
  • Technology pipeline and methods
  • Application scenarios and state of the art
  • Data collection (sampling, crawling)
  • Data analytics (statistical modeling, basic concepts, experiment design, pitfalls, R)
  • Data interpretation and use (visualization techniques, pitfalls, D3)
  • High-performance computing (parallel databases, MapReduce, Hadoop, NoSQL)
  • Cloud computing (principles, architectures, existing technologies)

Learning and Teaching

Teaching and learning methods

Lectures and tutorials, as well as coursework (group project, exercises).

Type                                  Hours
Lecture                               12
Completion of assessment task         71
Tutorial                              24
Preparation for scheduled sessions    6
Follow-up work                        6
Wider reading or practice             31
Total study time                      150

Assessment

Summative

Method                Percentage contribution
Group project         70%
Report                15%
Statistical report    15%

Referral

Method                      Percentage contribution
Coursework assignment(s)    100%

Repeat Information

Repeat type: Internal & External
