The University of Southampton
Web Science Institute

The Web Science DTC/CDT hosted the 2015 Web Science Research Sandpit (23rd March - 24th April 2015) which focused on research and innovation in the Web, Internet and Digital Economy

Published: 24 February 2015
Web Science CDT

The research sandpit was held to answer questions and generate insight on issues relevant to the Web, Internet and Digital Economy; to build relationships between Web Science partners; to facilitate networking opportunities for students and trainee researchers; and to influence their future research directions and careers.

Our external supporters nominated research questions that can be addressed by collecting and analysing data to create new understanding of an issue that their business faces, or by undertaking some horizon scanning to report on the latest capabilities in an emerging area of opportunity.

The research projects investigated by our MSc students and PhD researchers are listed below, along with links to their summaries.

Digital Economy Catapult Proposal

As wearable and sensing devices are increasingly connected to the internet, they raise new issues of privacy complexity. As the Internet of Things spreads, how will citizens react, and what expectations will they have about their personal data? Aspects of these issues can be studied even with the current generation of tracking apps, whose users are encouraged to post information to social networks. For example, users of RunKeeper often publicly post their detailed GPS tracking information on Twitter, and users of Jawbone Up bracelets may post detailed sleep pattern information. We propose some quantitative research to determine the extent to which this sensed personal data is already being publicly shared, and some qualitative research to understand attitudes and expectations around privacy, identity, permanence, searchability/findability and trust for data from wearables or sensors.


Challenge: “Convergence of Reality and Virtuality”. The author argues that within the next two decades there will, at an abstract level, be a ‘melding’ of real and virtual worlds in the eyes of Generation Z+ (2000+ babies). Even without evolving the concept of “brain-taps”, the rapid developments in haptic interfaces, cognitive interfaces, autonomy, new generations of immersive technologies, advanced visualisation, graphics and so forth will provide new paradigms for gaming, entertainment, education, teaching etc. for this generation. The research challenge is for the ‘sandpit’ to investigate, produce evidence and report on the following:

  1. What are the S&T developments (what, why, how, when and, most likely, who) that will have been productionised by 2035 and, in some cases, combined to offer a system/service?
  2. What are the potential implications of (1) in terms of sociological, psychological and economic aspects, as well as security and privacy?

Haymarket Consumer Media

Is there a reliable means, using freely available data, to quickly evaluate a global view of a product's reputation online, and how it changes over time? This could be a car, a smartphone, a fridge, a television... any popular consumer product. And by reputation, we mean portraying an easily digestible, aggregated view of positive/negative sentiment, along with clear, succinct descriptions of common strengths and weaknesses.

Can you predict an athlete's future success based on their place of birth? What correlation is there between your home town and your potential for future sports superstardom?

Office for National Statistics

Proposal for web scraping of communal establishment information

Introduction to the address register and communal establishment

The address register is central to the operational and statistical design of the Census. A high quality address register therefore has the benefits of ensuring operational efficiency and high quality statistical outputs. It is critical that the address register is able to distinguish between residential household addresses and communal establishments (CEs). It is important to accurately identify all CE types, as they require specific, often costly, enumeration processes. CEs can contain large populations; it is essential that these populations are accurately surveyed, as they can have a big impact on statistical outputs, especially at lower geographies. Examples of CEs include prisons, university halls, care homes and army barracks.

The CE address register needs to contain not only the address and the type of CE, but also an estimate of the number of residents. It also needs to include special enumeration households such as individual caravans in caravan parks and lived-in boats in marinas.

Research proposal

The research should be split into two phases. The first phase should review what information is available online about different types of CEs for contribution to the CE address register, and assess whether there are legal issues that would prevent such websites being scraped. For each individual CE, ONS would like to understand:

  • Its location
  • Contact details such as address, phone number and name of manager
  • Its size (such as the number of rooms or caravans it has)
  • Information on residents (e.g. any age restrictions on residents)
  • Information on facilities and services (e.g. broadband availability)
  • Any other information of relevance.

The second phase of the project is dependent on the outcome of the first phase. ONS would like to establish the feasibility of producing a prototype web scraper from one website which can automatically collect such information.
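As a rough illustration of what such a prototype might collect, the sketch below parses one listing into a record holding some of the fields named above (type, name, address, phone number and size). Everything in it is an assumption made for illustration: the HTML markup, the class names, the `CommunalEstablishment` record and the `parse_listing` helper are hypothetical and do not describe any real website or ONS system. It works on an inline sample rather than fetching a page, since whether a given site may legally be scraped is a question for the first phase.

```python
from dataclasses import dataclass
from html.parser import HTMLParser
from typing import List, Optional

# Hypothetical listing markup; a real website would need its own rules.
SAMPLE_HTML = """
<div class="ce" data-type="care home">
  <span class="name">Example House</span>
  <span class="address">1 Example Road, Exampleton</span>
  <span class="phone">00000 000000</span>
  <span class="rooms">42</span>
</div>
"""


@dataclass
class CommunalEstablishment:
    """One CE record; field names are illustrative assumptions."""
    ce_type: Optional[str] = None
    name: Optional[str] = None
    address: Optional[str] = None
    phone: Optional[str] = None
    rooms: Optional[int] = None


class CEParser(HTMLParser):
    """Collects one record per <div class="ce"> block."""
    FIELDS = {"name", "address", "phone", "rooms"}

    def __init__(self) -> None:
        super().__init__()
        self.records: List[CommunalEstablishment] = []
        self._current: Optional[CommunalEstablishment] = None
        self._field: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        if tag == "div" and "ce" in classes:
            # Start a new record; the CE type comes from a data attribute.
            self._current = CommunalEstablishment(ce_type=attr_map.get("data-type"))
            self.records.append(self._current)
        elif tag == "span" and self._current is not None:
            matches = self.FIELDS.intersection(classes)
            self._field = matches.pop() if matches else None

    def handle_data(self, data):
        text = data.strip()
        if self._current is None or self._field is None or not text:
            return
        if self._field == "rooms":
            # Size is stored as an integer when the text is purely numeric.
            self._current.rooms = int(text) if text.isdigit() else None
        else:
            setattr(self._current, self._field, text)

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "div":
            self._current = None


def parse_listing(html: str) -> List[CommunalEstablishment]:
    """Parse listing HTML into a list of CE records."""
    parser = CEParser()
    parser.feed(html)
    parser.close()
    return parser.records


if __name__ == "__main__":
    for record in parse_listing(SAMPLE_HTML):
        print(record)
```

Keeping the record type separate from the parser means the same structure could be filled from other sources; only the parsing rules would change from one website to the next.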

Ordnance Survey

How can we identify where real-world change is happening from information on the web and in social media sources? Supporting info: Types of change range from new construction works to changes in business names. Current change detection methods use image processing techniques and other directly sourced information; this research question is interested in what change can be identified from web-based information.


What is the 'cloud'? Cloud is a very prevalent term in the Web industry. Amazon have claimed their AWS offering is the future of their business; Google is in the same place with GCE. What does 'cloud' really mean for the future of Web services? And what are the commercial, technical and philosophical motivations behind it?


IBM - New York City 360°

ACM Multimedia Grand Challenge Solutions. The explosion of social media provides the possibility of observing and understanding a distant city. In the past, people could only obtain a rough impression by hearing from friends, reading books or newspapers, or watching TV. We expect that the flourishing of social media will bring a revolutionary change. The topics of this grand challenge include but are not limited to:

  • Discovering and describing different foods
  • Identifying fashion trends
  • Finding interesting stories of daily life
  • Describing major public activities and events
  • Following local sports and fan activities
  • Gaining insights into political issues and opinions
  • Capturing different art, music, entertainment and cultural activities.

How to participate

If you are an MSc or PhD researcher at the University of Southampton, you can choose to be part of a research group for the duration of the Sandpit. You don't have to be in the Web Science DTC/CDT to participate. After reading the research project proposals, you might find your research is closely linked to Web Science. If you are interested in reading the proposals, please email Claire Wyatt, c.wyatt at (we are unable to post all the proposals here for confidentiality reasons).

Contact the DTC Programme Manager, Claire Wyatt (c.wyatt at), if you would like to get involved.




