Hummingbird Project Wrap-Up
Humans are remarkably good at recognising one another from both the face and the voice. However, humans are not alone in being able to perform these tasks. Computer algorithms can now identify faces and voices, with performance approaching (and sometimes exceeding) that of human perceivers under certain conditions. The aim of the Hummingbird Project was to (i) compare the performance of humans and computer algorithms on face and voice matching (verification) tasks and (ii) determine whether building in human-like strategies could improve algorithm transparency and performance.
The first phase of the Hummingbird Project focused on face recognition. The goal of this phase of work was to identify the strategies used by humans during face recognition and then to find innovative ways of incorporating these strategies into computer algorithms in the hope of enhancing capability. A broad survey of the human face recognition literature revealed four factors of importance: facial distinctiveness, facial familiarity, internal feature weighting, and the use of 3D information. While facial distinctiveness and familiarity showed little effect on algorithm performance, differentially weighting the internal features of the face – the T-shaped region encompassing the eyes, nose, and mouth – produced a robust improvement in performance. Similarly, generating a 3D face from a single 2D image proved effective in correcting for variations in viewpoint across different images. Thus, the strengths of the human perceiver can successfully be captured in an automated process, improving both transparency and capability. For a full report of the findings, follow this link.
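To make the internal-feature weighting idea concrete, the sketch below fuses a whole-face similarity score with a score computed only over a central crop covering the eyes, nose, and mouth, giving the internal region the larger weight. The embedding function, crop coordinates, and weight value are illustrative placeholders, not the project's actual implementation.

```python
import numpy as np

def embed(image: np.ndarray) -> np.ndarray:
    """Placeholder face embedder: stands in for any model mapping an image to a vector."""
    rng = np.random.default_rng(abs(hash(image.tobytes())) % (2**32))
    return rng.standard_normal(128)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def internal_crop(face: np.ndarray) -> np.ndarray:
    """Central region of an aligned face image, roughly covering eyes, nose, and mouth."""
    h, w = face.shape[:2]
    return face[int(0.2 * h):int(0.8 * h), int(0.25 * w):int(0.75 * w)]

def match_score(face_a: np.ndarray, face_b: np.ndarray, internal_weight: float = 0.7) -> float:
    """Fuse whole-face similarity with an up-weighted internal-feature similarity."""
    whole = cosine(embed(face_a), embed(face_b))
    internal = cosine(embed(internal_crop(face_a)), embed(internal_crop(face_b)))
    return internal_weight * internal + (1.0 - internal_weight) * whole

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    probe, candidate = rng.random((112, 112, 3)), rng.random((112, 112, 3))
    print(f"match score: {match_score(probe, candidate):.3f}")
```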
The second phase of the Hummingbird Project focused on voice recognition. The goal of this phase of work was to test human and algorithm performance under different listening conditions and to determine whether the strategies used by humans could be incorporated into the algorithm to enhance capability. Experimental studies of human voice recognition revealed the vulnerability of unfamiliar voice matching to changes in speech style, background noise, telephone transmission, disguise, and temporal reversal. In contrast, the algorithms arrived at the correct identity decision 100% of the time. Nonetheless, the degree of similarity between the enrolment and verification voice samples (indicated by a ‘similarity’ or ‘distance’ score) was adversely affected by a change in speech style, background noise, telephone transmission, and disguise, in a similar pattern to that observed in human perceivers. A notable difference between the human perceivers and the algorithms was observed for unintelligible voice samples. While human performance was significantly worse for unintelligible speech clips, the algorithm was unaffected. The fact that the computer algorithms outperformed humans under all listening conditions tested poses an interesting question for the Hummingbird team: given that humans perform relatively poorly on voice recognition tasks, is it wise to build human-like strategies into the algorithm? Would this help or hinder performance? Work on this question is underway and will extend until March 2020.
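The verification logic described above can be sketched as follows: the enrolment sample and the verification sample are each mapped to a speaker embedding, the two embeddings are compared with a similarity score, and the score is tested against a threshold. The embedding function and threshold below are hypothetical placeholders; as noted above, degraded conditions such as noise or telephone transmission tend to lower the score even when the accept/reject decision remains correct.

```python
import numpy as np

def voice_embedding(waveform: np.ndarray) -> np.ndarray:
    """Placeholder speaker embedder; a real system would use a trained model."""
    rng = np.random.default_rng(abs(hash(waveform.tobytes())) % (2**32))
    return rng.standard_normal(192)

def similarity(enrolment: np.ndarray, verification: np.ndarray) -> float:
    """Cosine similarity between the enrolment and verification embeddings."""
    a, b = voice_embedding(enrolment), voice_embedding(verification)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def verify(enrolment: np.ndarray, verification: np.ndarray,
           threshold: float = 0.4) -> tuple[bool, float]:
    """Accept the claimed identity if the similarity score clears the threshold.

    Noise, telephone transmission, or disguise tend to pull the score towards
    the threshold, shrinking the margin even when the decision stays correct.
    """
    score = similarity(enrolment, verification)
    return score >= threshold, score

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    enrol, probe = rng.standard_normal(16000), rng.standard_normal(16000)
    decision, score = verify(enrol, probe)
    print(f"accepted: {decision}, score: {score:.3f}")
```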
Although the Hummingbird Project will soon come to a close, these initial findings demonstrate that human-like strategies can be successfully incorporated into computer algorithms to improve transparency and performance. Ongoing work will explore fusion between human and machine decision-makers. These findings provide valuable information to the biometrics community in situations such as border control (faces) and telephone banking (voices), where algorithms are regularly used to verify identity. Not only do these findings demonstrate the conditions in which the algorithms may be vulnerable, but they also suggest ways of incorporating human-like strategies in order to overcome these vulnerabilities.