Prof Stuart E. Middleton : Research


Research areas

Stuart's research interests are in Natural Language Processing (NLP), specifically Large Language Models (LLMs), Information Extraction and Human-in-the-loop LLMs. In contrast to NLP areas where web-scale datasets are available, his research focuses on developing novel solutions to problems where training datasets are small, evolving, sparse or fragmented. This can involve both finding new ways to fine-tune LLMs and researching novel methods to get the most out of smaller models.

Natural Language Processing - Large Language Models (LLMs)

Human-in-the-loop LLMs - Rationale-based learning, Active learning, Adversarial defence, Adversarial training, Interactive sense making.
Information Extraction - Text/Behaviour classification, Few/Zero shot learning, QA, RAG, Location extraction, Event extraction, Data rescue, Argument mining.

Domain expertise

Law enforcement, Defence, Mental Health, Environmental Science, Engineering, Social Science.

Examples of Impact and Outreach

Commercial : Innovate UK, Tackling challenges, building prosperity: The Industrial Strategy Challenge Fund, Orbital Witness: New technology spots key legal issues in real estate transactions, NLP research transferred to two-person startup Orbital Witness, helping them win £3.85 million venture capital during project lifetime, LPLP project 2021 Innovate UK link, PDF (see page 47)

Software : Middleton, S.E., geoparsing algorithm 'geoparsepy' is available open source from PyPI, averaging 1,500 downloads a month in 2021 [source pypistats.org] PyPI stats geoparsepy

Policy : Middleton, S.E., Invited AI Expert, UK Cabinet Office, London, Ministerial AI Roundtable: use of AI in policing, chaired by Policing Minister Nick Hurd, July 2019 FloraGuard outputs

Outreach : Cowell, C. Sajeva, M. Lavorgna, A. Middleton, S.E. Clarke, G. FloraGuard webinar, Royal Botanic Gardens, Kew, 2020, stakeholder analysis [314 registered, 170 attended live, 50 countries, major stakeholders such as DEFRA, WWF, US Dept of Justice, UN Office on Drugs and Crime (UNODC), European Commission and CITES] vimeo 1h 30mins duration

Steering Committees, Panels, Session Chair, Editorial Positions

Deputy Director - UKRI MINDS Centre for Doctoral Training [internship/sponsorship lead 2022 to 2024, deputy director 2024 to 2027]
Board Member - EPSRC Defence and Security Mobility DTP [MGT team 2025 to 2030]
Board Member - Centre for Machine Intelligence (CMI) 2023-2026
Full Member - EPSRC Peer Review College 2021+
Visiting Researcher - The Institute for Experimental AI, Northeastern University, USA 2024 to 2025
Invited Speaker - University of Oxford, Oxford, 2026
Invited Speaker - Safe and Trusted AI (STAI) Summer School, London, 2025
Invited Speaker - 2025 Workshop: Developing Future Views of the Veterans Mental Health Sector, London
Invited Speaker - Natural Language Processing Research Group, Sheffield, 2025
Organising Committee and Panel - RAI UK 2025 Workshop, Responsible AI for Mental Health, Charlotte, USA, 2025
Invited Expert - Roundtable Discussion with Senior Civil Servants, ‘Exploring the Role of AI in the Armed Forces’, King's College London, 2024
Organising Committee and Panel - RAI UK 2024 Workshop, Responsible AI for Mental Health, London, 2024
Organising Committee and Workshop Co-chair - AIUK 2024 Workshop: AI for Data Rescue
Invited Expert - TAS/RUSI workshop 2024 on 'Using AI in an Intelligence Context: Future Scenario Workshop', London, 2024
Area Chair - ACL 2023
Organising Committee and Workshop Co-chair - AIUK 2023 & AI Fest 5 Workshop - AI and Defence: Readiness, Resilience and Mental Health
Turing Fellow - 2021 to 2023
Organising Committee and Workshop Co-chair - RUSI and UKRI TAS Hub conference, Trusting Machines? Cross-sector Lessons from Healthcare and Security 2021
Sector Leads Committee - UKRI Trustworthy Autonomous Systems (TAS) Hub - 2020 to 2024
Guest Editor - MDPI Sensors journal 2021 special issue 'Sensors Application on Early Warning System'
Session Chair - ECAI 2020
Steering Committee (chair) - ACM WebSci'20 Workshop 2020, Socio-technical AI systems for defence, cybercrime and cybersecurity
Session Chair - ACM WebSci 2020
Invited Expert - UK Cabinet Office Ministerial AI Roundtable event 2019 on 'use of AI in policing', London, 2019
Invited Expert - ATI/DSTL workshop 2019 on 'Decision Support for Military Commanders', London, 2019
Steering Committee - RGS-IBG Annual Conference 2018, Using New Forms of Data in Research Session Convenor
Steering Committee Short paper/demo Chair - IEEE International Conference on Intelligent Environments [IE] 2016 Posters & Short Paper Track Chair
Steering Committee - MediaEval Benchmarking Initiative for Multimedia Evaluation [MediaEval] 2016 Verifying Multimedia Use Task Committee
Invited Expert - BBC South Today

Collaborations

The Institute for Experimental AI, Northeastern University, USA

Centre for Machine Intelligence

Rebooting Democracy: Democratic Innovation for the Information Age

Centre for Democratic Futures

Grants

Research summary £707k UoS as PI; £4.69M UoS as CoI, £49.84M total project funding (all partners)
Education summary £6.1M UoS as CoI
UKRI grant link for Stuart E. Middleton

UMW: Uncharted Maritime Worlds : an ERC funded Grant (CoI £2.4M UoS). UMW will develop an integrated digital history of seaborne trade from European archives c.1550‒c.1800. Objectives include use of LLMs to perform data rescue and analysis of scanned historical maritime records.

DR-Africa: Data Rescue Africa : a Met Office funded Grant (PI £201k UoS £358k total). DR-Africa will explore NLP models for automated observation transcription in the context of downstream applications such as climate change modelling.

Development of Advanced Wing Solutions (DAWS2) : an ATI/Innovate UK funded Grant (CoI UoS £1.132M £42.1M total 10079510). Objectives include Large Language Model (LLM) based engineering digital assistant and co-pilot applications to support the next generation of aircraft wing design.

Exploring Fairness and Bias of Multimodal Natural Language Processing for Mental Health, an International Partnerships Project (CoI £64k UoS, RAI UK EP/Y009800/1). Partnership project between the University of Southampton and Northeastern University focused on the responsible use of AI in addressing mental health issues.

UKRI Centre for Doctoral Training in Machine Intelligence for Nano-electronic Devices and Systems (MINDS CDT) : an EPSRC funded Grant (CoI £6.1M UoS EP/S024298/1). The MINDS CDT operates as a centre of training excellence for the next generation of systems that employ Artificial Intelligence (AI) algorithms in low-cost/low-power device technologies (hardware-enabled AI).

ProTechThem : an ESRC funded project (CoI £757k UoS ES/V011278/1). ProTechThem will explore sharenting (parents sharing online information about minors). Motivations for sharenting and automated detection of risk behaviours online will be explored through online ethnography, criminological analysis and multi-lingual few-shot NLP algorithms to support improvements to cybersecurity behaviours.

SafeSpacesNLP : a UKRI TAS Hub funded project (PI 1.25 FTE, UKRI TASHub). Behaviour classification NLP in a socio-technical AI setting for online harmful behaviours for children and young people. Exploring human-in-the-loop and graph-based NLP models for behaviour classification of online forum posts.

GloSAT : a UK NERC platform grant (PI UoS £256k £3.3M total NE/S015604/1). Global Surface Air Temperature (GloSAT) aims to improve understanding of climate variability and change. Objectives include multi-modal NLP for information extraction and data rescue of climate change sensors data from historical texts.

Gendered body language and speech styles in UK Parliament using machine learning : an Interdisciplinary Research Pump-Priming Fund project (CoI). NLP, audio processing and computer vision will be combined with political science methodologies to explore how gender mediates body language and speech styles in Parliamentary debates.

Multimodal audio-textual argumentation mining of political debates : a Web Science Institute grant (CoI £13k UoS). Development of a multimodal dataset for training NLP models to perform argument mining of political debates.

CYShadowWatch : a UK DSTL funded project (PI £116k UoS £133k total ACC2005442). Automated Multilingual Information Extraction for Online Cybercrime Sites. CYShadowWatch explored NLP methods of statistical machine translation and information extraction applied to online Russian cybercrime forums.

FloraGuard project : a UK ESRC funded project (CoI £237k UoS ES/R003254/1). FloraGuard examined and mapped, from a multidisciplinary perspective, the criminal market in endangered plants affecting the UK, exploring human-in-the-loop NLP for interactive sensemaking to support law enforcement. Quantitative evidence came from a combination of surface (web forums, social media) and dark web (TOR forums) crawling of cyber-criminal activity; NLP and machine learning were used to socio-economically map this activity at a community level.

Legal & Property Language Processing (LPLP) project : an Innovate UK funded project (PI £114k UoS £313k Total ref 104875). LPLP developed cutting-edge NLP techniques to extract and analyse legal rights and obligations related to property and land. Objectives included the development of NLP algorithms to extract legal rights and obligations from Land Registry documents and the development of machine learning based legal risk models for property and land.

Intel-Analysis DSTL : a UK DSTL funded project (CoI £83k UoS ACC102157). Intel-Analysis DSTL used argumentation schemes and evidential reasoning to support teams of analysts trying to evaluate conflicting hypotheses during real-time events. Evidence was obtained in real-time from a combination of human intelligence reports and information extraction from social media via NLP.

REVEAL project : an EU funded FP7 project (CoI €688k UoS €6.5M total 610928). REVEAL advanced the technologies necessary to make higher-level analysis of social media possible. It focussed on social media verification, including NLP for digital text forensics, trust and credibility analytics and decision support for journalists verifying user generated content.

Digital Police Officer (DPO) project : a UK WSI funded project (PI). The DPO project aimed to apply linguistic analysis to identify cyber criminals operating under pseudonyms on different online forums and within the same forum. The project applied NLP techniques guided by insights from criminology.

See the publications link for details of the above work.

Electronics and Computer Science

