Stuart's research interests lie in the Natural Language Processing (NLP) areas of information extraction and human-in-the-loop NLP. In contrast to big-dataset NLP, his research focuses on developing novel solutions to problems where training datasets are small, evolving, sparse or fragmented. He is interested in investigating socio-technical NLP approaches, including few/zero-shot learning, graph-based models, text classification, active learning, adversarial training and argument mining. His 60+ peer-reviewed publications focus on ACL, ACM and IEEE events and journals, but through work with researchers from other disciplines he has also published in a variety of interdisciplinary conferences and journals.
Natural Language Processing
Information Extraction - text/behaviour classification, few/zero-shot learning, graph-based models, geoparsing/location extraction, knowledge-base population, tabular data extraction, temporal extraction, event/topic extraction, argument mining.
Human-in-the-loop NLP - active learning, adversarial training, rationale-based training, interactive sense making.
Other - trustworthy AI, digital text forensics, domain adaptation.
Law enforcement, Defence, Mental Health, Environmental Science, Legal, Misinformation.
Examples of Impact and Outreach
Commercial : Innovate UK, Tackling challenges, building prosperity: The Industrial Strategy Challenge Fund, Orbital Witness: New technology spots key legal issues in real estate transactions, NLP research transferred to two-person startup Orbital Witness helping them win £3.85 million venture capital during project lifetime, LPLP project 2021 Innovate UK link, PDF (see page 47)
Software : Middleton, S.E., geoparsing algorithm 'geoparsepy' is available open source from PyPI, averaging 1,500 downloads a month in 2021 [source pypistats.org] PyPI stats geoparsepy
Policy : Middleton, S.E., Invited AI Expert, UK Cabinet Office, London, Ministerial AI Roundtable: use of AI in policing, chaired by Policing Minister Nick Hurd, July 2019 FloraGuard outputs
Outreach : Cowell, C. Sajeva, M. Lavorgna, A. Middleton, S.E. Clarke, G. FloraGuard webinar, Royal Botanic Gardens, Kew, 2020, stakeholder analysis [314 registered, 170 attended live, 50 countries, major stakeholders such as DEFRA, WWF, US Dept of Justice, UN Office on Drugs and Crime (UNODC), European Commission and CITES] vimeo 1h 30mins duration
Steering Committees, Panels, Session Chair, Editorial Positions
Area Chair - ACL 2023
Research grants - data from 2015 (£578k UoS as PI; £8.7M UoS as CoI)
UKRI grants for Stuart E. Middleton
Development of Advanced Wing Solutions (DAWS2) : an InnovateUK funded Grant (CoI £42.1M total, UoS split £1.1M). Objectives include Large Language Model (LLM) based engineering digital assistant and co-pilot applications to support the next generation of aircraft wing design.
UKRI Centre for Doctoral Training in Machine Intelligence for Nano-electronic Devices and Systems (MINDS CDT) : an EPSRC funded Grant (CoI £5.8M UoS EP/S024298/1). The MINDS CDT operates as a centre of training excellence for the next generation of systems that employ Artificial Intelligence (AI) algorithms in low-cost/low-power device technologies (hardware-enabled AI).
ProTechThem : an ESRC funded project (CoI £770k UoS ES/V011278/1). ProTechThem will explore sharenting (parents sharing online information about minors). Motivation for sharenting and automated detection of risk behaviours online will be explored through online ethnography, criminological analysis and multi-lingual few-shot NLP algorithms to support improvement to cybersecurity behaviours.
SafeSpacesNLP : a UKRI TAS Hub funded project (PI). Behaviour classification NLP in a socio-technical AI setting for online harmful behaviours for children and young people. Exploring human-in-the-loop and graph-based NLP models for behaviour classification of online forum posts.
GloSAT : a UK NERC platform grant (PI £3.3M total, UoS split £260k NE/S015604/1). Global Surface Air Temperature (GloSAT) aims to improve understanding of climate variability and change. Objectives include multi-modal NLP for information extraction and data rescue of climate change sensor data from historical texts.
Gendered body language and speech styles in UK Parliament using machine learning : an Interdisciplinary Research Pump-Priming Fund project (CoI). NLP, audio processing and computer vision will be combined with political science methodologies to explore how gender mediates body language and speech styles in Parliamentary debates.
Multimodal audio-textual argumentation mining of political debates : a Web Science Institute grant (CoI). Development of a multimodal dataset for training NLP models to perform argument mining of political debates.
CYShadowWatch : a UK DSTL funded project (PI £116k UoS ACC2005442). Automated Multilingual Information Extraction for Online Cybercrime Sites. CYShadowWatch explored NLP methods of statistical machine translation and information extraction applied to online Russian cybercrime forums.
FloraGuard project : a UK ESRC funded project (CoI £240k UoS ES/R003254/1). FloraGuard examined and mapped from a multidisciplinary perspective the criminal market in endangered plants affecting the UK, exploring human-in-the-loop NLP for interactive sensemaking to support law enforcement. Quantitative evidence came from a combination of surface web (web forums, social media) and dark web (TOR forums) crawling of cyber-criminal activity; NLP and machine learning were used to socio-economically map this activity at a community level.
Legal & Property Language Processing (LPLP) project : an Innovate UK funded project (PI £142k UoS 104875). LPLP developed cutting-edge NLP techniques to extract and analyse legal rights and obligations related to property and land. Objectives include the development of NLP algorithms to extract legal rights and obligations from Land Registry documents and the development of machine learning based legal risk models for property and land.
Intel-Analysis DSTL : a UK DSTL funded project (CoI £83k UoS ACC102157). Intel-Analysis DSTL used argumentation schemes and evidential reasoning to support teams of analysts trying to evaluate conflicting hypotheses during real-time events. Evidence was obtained in real-time from a combination of human intelligence reports and information extraction from social media via NLP.
REVEAL project : an EU funded FP7 project (CoI €688k UoS 610928). REVEAL advanced the technologies needed to make higher-level analysis of social media possible. It focussed on social media verification, including NLP for digital text forensics, trust and credibility analytics and decision support for journalists verifying user generated content.
Digital Police Officer (DPO) project : a UK WSI funded project (PI). The DPO project aimed to apply linguistic analysis to identify cyber criminals operating under pseudonyms on different online forums and within the same forum. The project applied NLP techniques guided by insights from criminology.
See the publications link for details of the above work.
Electronics and Computer Science