The University of Southampton
Mathematical Sciences

Using structure and content to reveal the evolution of narratives in social media Seminar

Time:
12:00
Date:
20 October 2015
Venue:
54/7033 (7C)

Event details

Applied Mathematics Seminar

The wealth of data available from a variety of sources presents attractive opportunities in academia and beyond. Analysing large datasets and extracting useful information from them is not a trivial task: collections of data are often complex and noisy, with several layers of structure. Data from social media and other sources can be processed in many ways; recently we have studied a dataset of relationships among Twitter users who were prominent during the 2011 riots in England. These data consist of the names and descriptions of the users and their mutual relationships (i.e., who follows whom). Although the data did not include the actual messages that passed through these links during the riots, we are able to study the structure of the relationships to reveal information about the users, their interests, hierarchies and roles. Analyses of the network structures created by relationships or interactions between the data-generating agents, however, cannot answer questions such as: what topics do users of social media talk about, and how do these topics and their user participation change over time? To find answers we must go beyond the metadata and look at the content produced by the users.

We have developed a method to study large, longitudinal collections of textual data that allows us to understand the evolution of discourse and group narratives. Our method uses topic timelines, a concept we have recently introduced. Topic timelines are networks whose nodes are content-units (such as topics) that appear in a given time interval; the edges may depend on the shared authors between the nodes, topical similarity, etc. Handling data in this way creates tractable networks that, for example, not only reveal what topics appear when, but also help to understand the relationship among the different topics in terms of agent participation or similarity. These new networks can be explored using standard tools from network science. For example, by extracting their communities we can track the origin, evolution, and decline of collective narratives; we can identify which seemingly disparate topics are related through their common users, obtain the user turnover of topics, or know when a topic becomes exhausted for some groups of users but not for others.
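As a rough illustration of the idea of a topic timeline, the sketch below builds a small network whose nodes are (topic, time window) pairs and whose edges link nodes that share authors. The abstract does not specify the construction, so everything here is an assumption made for illustration: the function name build_topic_timeline, the input format, the min_shared threshold, and the use of the Jaccard index as the edge weight.

```python
from collections import defaultdict
from itertools import combinations


def build_topic_timeline(posts, min_shared=1):
    """Build a toy topic timeline (hypothetical helper, not the speaker's code).

    posts: iterable of (author, topic, window) tuples, where `window`
    labels the time interval in which the post appeared.
    Returns (nodes, edges): nodes maps (topic, window) -> set of authors,
    edges maps a pair of nodes -> Jaccard overlap of their author sets.
    """
    authors = defaultdict(set)
    for author, topic, window in posts:
        authors[(topic, window)].add(author)

    edges = {}
    # Compare every pair of nodes; keep an edge if enough authors are shared.
    for u, v in combinations(sorted(authors), 2):
        shared = authors[u] & authors[v]
        if len(shared) >= min_shared:
            # Assumed edge weight: Jaccard index of the two author sets.
            edges[(u, v)] = len(shared) / len(authors[u] | authors[v])
    return dict(authors), edges


# Made-up example data, for illustration only.
posts = [
    ("ana", "obesity", 0),
    ("ben", "obesity", 0),
    ("ana", "diabetes", 1),
    ("ben", "diabetes", 1),
    ("cat", "diabetes", 1),
    ("dan", "nhs", 1),
]
nodes, edges = build_topic_timeline(posts)
for (u, v), weight in edges.items():
    print(u, v, round(weight, 2))
```

In this toy data the only edge joins the obesity topic in window 0 to the diabetes topic in window 1, because two of the three diabetes authors also posted about obesity. A network built this way could then be passed to a community-detection routine to group related topics across time.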

This method provides a way to make large and complex sets of longitudinal textual data tractable and amenable to analysis with the rich palette of tools from network science, offering a new point of view from which collective discourse can be studied. We showcase our method on collections of Twitter status updates which include conversations about obesity, diabetes and the UK's National Health Service. This methodology is applicable not only to social media but to any collection of longitudinal data generated by large numbers of agents.

Speaker information

Mariano Beguerisse-Diaz, Research Fellow, Imperial College London
