The University of Southampton
Web Science Institute

Digital Police Officer: Linguistic Analysis to Identify Cybercriminals


The concept of ‘trust’ is paramount online: users encounter one another in an unmediated space in which impersonation is easy, and phishing and harassment are common. Concerns about the impact of untrustworthy and criminal behaviours on the Web have grown substantially in recent years, with cybercrime costing UK businesses an estimated average of £3m a year.

It is notoriously difficult to establish trust online. Consider, for example, a user who has been expelled from an online forum but returns under another nickname. How can we know that the person we are interacting with is not this individual? How could anyone seeking to police the Internet determine whether a newcomer is the banned individual?

This project investigates linguistic analysis as a possible solution, through a demo that identifies an online user by the way in which they communicate. The approach is to analyse how forum users write (their vocabulary and grammar) and to build a linguistic fingerprint for each user using Natural Language Processing technologies.
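
As a rough illustration of what a linguistic fingerprint might look like, the sketch below profiles each user by how often they use a handful of common function words and compares profiles with cosine similarity. This is not the project's implementation: the word list, the function names (fingerprint, cosine_similarity, closest_known_user) and the choice of cosine similarity are all assumptions made for the example, and a realistic system would draw on far richer vocabulary and grammar features.

```python
import math
import re
from collections import Counter

# Hypothetical feature set: a few English function words. Real stylometric
# systems typically use hundreds of features (words, character n-grams, syntax).
FUNCTION_WORDS = ("the", "a", "an", "and", "but", "or", "of", "to",
                  "in", "on", "i", "you", "it", "is", "that", "not")


def fingerprint(posts):
    """Return the relative frequency of each function word across a user's posts."""
    tokens = [t for post in posts for t in re.findall(r"[a-z']+", post.lower())]
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero for empty input
    return [counts[word] / total for word in FUNCTION_WORDS]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def closest_known_user(unknown_posts, known_users):
    """Score an unknown author's posts against each known user's fingerprint.

    known_users maps a username to a list of that user's posts.
    Returns the best-matching username and the full score table.
    """
    target = fingerprint(unknown_posts)
    scores = {name: cosine_similarity(target, fingerprint(posts))
              for name, posts in known_users.items()}
    best = max(scores, key=scores.get)
    return best, scores
```

Given posts from known users and posts from a new account, closest_known_user returns the known user whose profile is most similar, together with every score. A high score is only a lead to be weighed against other evidence, not proof that two accounts belong to the same person.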

The effectiveness of the approach will be assessed in the context of law enforcement. We consider carding forums, which are venues for buying and selling stolen credit card data. When one carding forum closes, another rises: criminals with a trusted reputation on a defunct site 'port' their reputations to another without necessarily keeping the same username or other identifying features. The Digital Police Officer (DPO) project will determine to what extent current linguistic analysis technologies can track such users, with potential applications in interpersonal trust online as well as law enforcement.


Principal Investigator: Dr Clare Hooper, IT Innovation

Co-Investigator: Dr Craig Webber, Criminology, Social Sciences


Final report.

