In the 18th century, Swiss poet and preacher Johann Kaspar Lavater brought back an ancient “science.” Physiognomy, the practice of judging someone’s personality from their face, was not a new idea. It had been discussed by some ancient philosophers but fell into disrepute because, well, it didn’t work. Still, Lavater revived it.
He wrote several popular essays, drawing mixed reactions but ultimately popularizing physiognomy, especially in the field of criminology. Even though physiognomy didn’t work, it found its proponents. It proved useful as a tool for segregation and for pushing the idea that some races are superior to others. Lavater’s bias towards European features led to racial stereotyping. He would write, for instance, that Jewish features were a sign of “neither generosity, nor tenderness, nor elevation of mind.”
European colonialists and race theorists used this to argue for the supposed superiority of white Europeans, claiming that other races and ethnicities reflect inferiority and criminality. Before falling firmly into the bin of junk science, physiognomy was used to influence eugenics and fascist ideologies.
Now, machine learning is bringing this back.
Your face is not who you are
Proponents of neo-physiognomy argue that deep neural networks (DNNs) uncover correlations that human judgment cannot. Purportedly, they can achieve unprecedented accuracy in identifying latent traits. However, the allure of high-performance metrics masks a deeper issue: the lack of scientific legitimacy.
“We hold up the renewed emergence of physiognomic methods, facilitated by ML, as a case study in the harmful repercussions of ML-laundered junk science,” write the authors of a new study.
“Research in the physiognomic tradition goes back centuries, and while the methods largely fell out of favor with the downfall of the Third Reich, the prospects of ML have renewed scientific interest in the subject,” they add.
Machine learning systems, particularly those involving DNNs, are exceptionally adept at detecting patterns in data. Yet, their results are only as valid as the data and assumptions they are built upon.
These models often fail to address confounding variables, overfit to biased datasets, and lack causal mechanisms for their claims. For example, studies claiming to infer criminality or sexual orientation from facial features base their conclusions on datasets that may reflect societal biases rather than inherent traits. A notable example includes training data scraped from social media, laden with cultural and contextual biases.
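To make the confounding problem concrete, here is a minimal, purely illustrative sketch in Python. The synthetic data, the “camera angle” nuisance variable, and the scikit-learn model are all assumptions chosen for demonstration, not taken from any of the studies discussed. The simulated “face” features carry no information about the label at all; the classifier scores well only because a nuisance variable happens to track the label in the training set.

# A minimal sketch (hypothetical data) of a classifier "predicting" a latent
# trait by latching onto a confound rather than anything in the face itself.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

def make_data(n, confounded=True):
    y = rng.integers(0, 2, size=n)                 # the "latent trait" label
    face = rng.normal(size=(n, 20))                # face features: pure noise w.r.t. y
    if confounded:
        angle = y + rng.normal(scale=0.5, size=n)  # camera angle happens to track the label
    else:
        angle = rng.normal(scale=0.5, size=n)      # confound removed
    X = np.column_stack([face, angle])
    return X, y

X_train, y_train = make_data(2000, confounded=True)
X_test_conf, y_test_conf = make_data(1000, confounded=True)
X_test_clean, y_test_clean = make_data(1000, confounded=False)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

print("accuracy with confound   :", accuracy_score(y_test_conf, clf.predict(X_test_conf)))
print("accuracy without confound:", accuracy_score(y_test_clean, clf.predict(X_test_clean)))
# Typically ~0.85 with the confound present and ~0.50 (chance) without it:
# the impressive metric reflects a dataset artifact, not the face.

Swap the confounded test set for a clean one and the “impressive” accuracy collapses to chance, which is the kind of failure the critics describe.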
Resurgence of a pseudoscience
“A recent surge of deep learning-based studies have claimed the ability to predict unobservable latent character traits, including homosexuality, political ideology, and criminality, from photographs of human faces or other records of outward appearance,” write the authors, before giving a few recent examples. “In response, government and industry actors have adapted such methods into technologies deployed on the public in the form of products such as Faception, Hirevue, and Turnitin.”
Take, for instance, one recent study supporting the ability of algorithms to detect whether someone is homosexual simply by looking at their face. The reasoning in this approach is circular: the “gender-prototypical” facial morphology invoked to explain the results is defined by the very same measures used in the original sexuality classification task, the authors of the new research explain.
The problem of confounding factors is also well-illustrated with this example. The authors of the homosexuality physiognomy study told participants (college students) to hold their chin at a perfect 90-degree angle to their body. “It might be ventured that the average 19-year-old falls short of a perfectly calibrated proprioceptive sensibility of a 90 [degree] chin-to-body angle,” quip Mel Andrews, Andrew Smart, and Abeba Birhane, the authors of the new study.
Moreover, these technologies pose existential risks to marginalized communities. Studies claiming to identify sexual orientation or gender from facial features could be weaponized in regions where LGBTQ+ identities are criminalized. Despite claims by researchers that these tools are meant as warnings, their very existence enables opportunistic actors to exploit them.
AI-powered physiognomy can be very dangerous
The pseudoscientific framework of physiognomy laid a foundation for systemic racism, influencing fields like anthropology, criminology, and eugenics. These biases were embedded into societal structures and used to justify slavery, segregation, and colonialism. While physiognomy has been debunked as junk science, its legacy persists — and this is apparent in the new wave of AI physiognomy studies.
We already know that AI datasets are often biased, and this can lead to misleading outputs. The coupling of AI and physiognomy risks perpetuating and amplifying biases by using flawed and pseudoscientific principles to make judgments about individuals based on their appearance.
AI systems trained on biased datasets can inherit societal stereotypes, leading to discriminatory outcomes in areas like hiring, policing, and social services. For instance, it’s not hard to see how an algorithm claiming to detect whether someone is gay could be weaponized.
This issue is compounded by the culture of ML itself. A focus on rapid innovation, minimal peer review, and the commodification of research outputs creates an environment where flawed methodologies can thrive. Whereas most sciences emphasize demonstrable domain expertise and rigorous peer evaluation, ML research often bypasses these safeguards.
The reanimation of physiognomy serves as a cautionary tale of what happens when technology is wielded without accountability. It’s a reminder that progress in science is measured not just by what we can achieve, but by the wisdom with which we achieve it.
This is far from just a hypothetical problem. Its effects are already being felt.
“Authoritarian governments already actively use such technologies to suppress dissent and repress human rights,” the authors conclude.