About the project
This PhD project aims to advance current generative modelling techniques in computer vision. You'll address challenges related to language-vision integration and the transition from specialised, task-oriented approaches to adaptable generalist models. Our research aims to unlock the potential of generative models that combine expertise and versatility through language-vision integration.
The landscape of generative modelling in computer vision has undergone a remarkable transformation, evolving from conventional vision-based techniques to sophisticated language-vision models. This evolution has been propelled by the fusion of natural language understanding and image generation within computer vision systems, leading to ground-breaking research and practical applications such as DALL·E 3 integrated with ChatGPT.
This integration enables machines not only to interpret visual data but also to generate contextually rich, human-like descriptions, blurring the lines between artificial intelligence and human cognition. Furthermore, this shift has given rise to versatile generalist models, capable of handling diverse tasks, necessitating innovative solutions to seamlessly integrate language and vision.
The Vision, Learning, and Control (VLC) group at the School of Electronics and Computer Science (ECS) is opening two PhD positions. We cordially invite individuals who are passionate about computer vision and machine learning to apply. The successful candidates will engage in cutting-edge research within the supportive and collaborative environment of the VLC group.