The University of Southampton

COMP6237 Data Mining

Module Overview

The challenge of data mining is to transform raw data into useful information and actionable knowledge. Data mining is the computational process of discovering patterns in data sets, using methods at the intersection of artificial intelligence, machine learning, statistics, and data management. This course will introduce key concepts in data mining, information extraction and information indexing, including specific algorithms and techniques for feature extraction, clustering, outlier detection, topic modelling and prediction on complex unstructured data sets. By taking this course you will gain a broad view of the general issues surrounding unstructured and semi-structured data and the application of algorithms to such data. At a practical level you will have the chance to explore an assortment of data mining techniques, which you will apply to problems involving real-world data.

Aims and Objectives

Module Aims

To explore the role of data mining in solving real-world problems

Learning Outcomes

Knowledge and Understanding

Having successfully completed this module, you will be able to demonstrate knowledge and understanding of:

  • Key concepts, tools and approaches for data mining on complex unstructured data sets (including multimedia mining, Twitter analysis, etc.)
  • Natural language processing techniques for extracting features from text
  • The theory behind modern data indexing systems
  • Techniques for modelling and extracting features from non-textual data
  • State-of-the-art data-mining techniques including topic modelling approaches such as LDA, clustering techniques and applications of matrix factorisations
  • Theoretical concepts and the motivations behind different data-mining approaches
Subject Specific Intellectual and Research Skills

Having successfully completed this module you will be able to:

  • Conceptually understand the role of data-mining, together with the mathematical techniques this requires
Subject Specific Practical Skills

Having successfully completed this module you will be able to:

  • Solve real-world data-mining, data-indexing and information extraction tasks

Syllabus

Key concepts:

  • The importance of data-mining
  • Real-world applications of data-mining (cyber-security, financial forecasting, trend prediction, etc.)
  • What is unstructured data
    - Modalities of data
  • Underlying techniques
    - Inverted indexes
    - Matrix factorisation
    - Dimensionality reduction

Modelling data:

  • Understanding Text
    - Bags of Words
    - TF-IDF
  • Dealing with non-textual data
    - Feature extraction techniques
    - Bags of features
    - Encoding and embedding

Modern data indexing at scale:

  • Information retrieval models
  • Ranking models

Unimodal data mining:

  • Topic modelling (techniques such as LSA, pLSA, LDA, NNMF)
  • Clustering (Hierarchical agglomerative, Spectral)
  • Multi-dimensional scaling
  • Mining graphs and networks (hubs and authorities [PageRank/HITS], spectral methods, etc.)
  • Finding outliers

Multimodal data mining:

  • Finding independent features (e.g. ICA, NNMF)
  • Finding correlations and making predictions (CL-LSI, classifiers, etc.)
  • Collaborative filtering and recommender systems

Learning and Teaching

Type                                  Hours
Revision                              10
Lecture                               24
Wider reading or practice             46
Tutorial                              16
Preparation for scheduled sessions    12
Completion of assessment task         30
Follow-up work                        12
Total study time                      150

Resources & Reading list

Toby Segaran (2007). Programming Collective Intelligence: Building Smart Web 2.0 Applications. 

Assessment

Summative

Method                   Percentage contribution
Examination (2 hours)    50%
Group Coursework         30%
Practical assessment     20%

Repeat

Method                   Percentage contribution
Examination              100%

Referral

Method                   Percentage contribution
Examination (2 hours)    100%

Repeat Information

Repeat type: Internal & External

Linked modules

Prerequisite: COMP3206 or COMP3222 or COMP3223 or COMP6229 or COMP6245 or COMP6246
